Domain-Adaptive Vision Transformer

Domain-Adaptive Vision Transformer (DA-ViT) applies domain adaptation techniques, such as adversarial alignment, self-training, or attention-level bridging, on top of a pretrained Vision Transformer (ViT) backbone. The goal is to transfer visual knowledge from a labeled source domain to an unlabeled or lightly labeled target domain, mitigating the distribution shift that limits standard ViT fine-tuning.
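As a rough illustration of the self-training option, the sketch below shows one generic pseudo-labeling step: predictions on unlabeled target images are kept as training labels only when the classifier is confident. It is a minimal sketch under stated assumptions, not the method of any cited paper. The function names, the 0.9 threshold, and the example logits are all hypothetical, and the logits stand in for the output of a classification head on a ViT backbone.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(target_logits, threshold=0.9):
    """Keep target samples whose top class probability reaches the threshold.

    Returns a list of (sample_index, predicted_class, confidence) tuples.
    The threshold value is an illustrative assumption.
    """
    selected = []
    for i, logits in enumerate(target_logits):
        probs = softmax(logits)
        conf = max(probs)
        if conf >= threshold:
            selected.append((i, probs.index(conf), conf))
    return selected


# Hypothetical classifier logits for three unlabeled target-domain images.
target_logits = [
    [4.0, 0.0, 0.0],  # confident prediction for class 0
    [0.2, 0.1, 0.0],  # ambiguous prediction, should be skipped
    [0.0, 0.5, 5.0],  # confident prediction for class 2
]

pseudo = select_pseudo_labels(target_logits, threshold=0.9)
for index, label, conf in pseudo:
    print(f"sample {index}: pseudo-label {label} (confidence {conf:.3f})")
```

In a full self-training loop, the selected samples would typically be added to the training set with their pseudo-labels and the model fine-tuned again. A stricter threshold trades fewer pseudo-labels for less label noise.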

Sources

  1. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. International Conference on Learning Representations (ICLR).
  2. Yang, J., Liu, J., Xu, N., & Huang, J. (2023). TVT: Transferable Vision Transformer for Unsupervised Domain Adaptation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 520-530.

ScholarGate. Domain-adaptive vision transformer (Domain-Adaptive Vision Transformer (DA-ViT)). Retrieved 2026-06-04 from https://scholargate.app/en/deep-learning/domain-adaptive-vision-transformer