
Semi-supervised Vision Transformer

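A semi-supervised Vision Transformer applies the patch-based self-attention architecture of the Vision Transformer (ViT) to datasets in which only a fraction of the images are labeled. It exploits the large pool of unlabeled data through pseudo-labeling, consistency regularization, or self-supervised pretext tasks, and is then fine-tuned on the small labeled subset. With this approach, accuracy can approach that of fully supervised training even when labeled images are scarce.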
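To make the pseudo-labeling and consistency-regularization ideas concrete, here is a minimal sketch of a FixMatch-style unlabeled loss. FixMatch is only one possible recipe and is not necessarily the one this entry has in mind. The sketch assumes that a ViT has already produced class logits for a weakly and a strongly augmented view of each unlabeled image; the function names, the 0.95 threshold, and the plain-list inputs are illustrative assumptions, not part of any library.

```python
import math


def log_sum_exp(logits):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))


def softmax(logits):
    """Convert raw logits into class probabilities."""
    lse = log_sum_exp(logits)
    return [math.exp(x - lse) for x in logits]


def pseudo_label(logits, threshold=0.95):
    """Return the predicted class index if the model is confident, else None.

    `threshold` is an assumed hyperparameter; predictions below it are
    treated as too unreliable to serve as training targets.
    """
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None
    return probs.index(confidence)


def unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """FixMatch-style consistency loss over a batch of unlabeled images.

    weak_logits / strong_logits: hypothetical model outputs (lists of
    per-class logits) for weakly and strongly augmented views of the
    same images. The weak view supplies the pseudo-label; the strong
    view is trained to match it via cross-entropy.
    """
    if not weak_logits:
        return 0.0
    total = 0.0
    for weak, strong in zip(weak_logits, strong_logits):
        target = pseudo_label(weak, threshold)
        if target is None:
            continue  # low-confidence samples contribute zero loss
        # Cross-entropy of the strong view against the pseudo-label.
        total += log_sum_exp(strong) - strong[target]
    # Average over the whole batch, including the masked-out samples.
    return total / len(weak_logits)


if __name__ == "__main__":
    weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.2]]   # confident, then uncertain
    strong = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(pseudo_label(weak[0]), pseudo_label(weak[1]))
    print(unlabeled_loss(weak, strong))
```

In a full training loop this term would typically be added, with a weighting factor, to the ordinary cross-entropy loss on the labeled images. The confidence mask keeps uncertain predictions from being reinforced as if they were ground truth.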


Related methods

Fine-tuned Vision Transformer for image classification
Self-supervised Vision Transformer
Semi-supervised BERT classification
Semi-supervised convolutional neural network
Vision Transformer

Sources

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. International Conference on Learning Representations (ICLR 2021).
Zhai, X., Kolesnikov, A., Houlsby, N., & Beyer, L. (2022). Scaling Vision Transformers. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12104–12113.

How to cite this page

ScholarGate. (2026, June 3). Semi-supervised Vision Transformer (Semi-supervised ViT). ScholarGate. https://scholargate.app/zh/deep-learning/semi-supervised-vision-transformer
