Compare methods
Review the selected methods side by side; rows that differ are marked with ≠.
| | Weakly Supervised Vision Transformer | Self-Supervised Learning |
|---|---|---|
| Field ≠ | Deep learning | Machine learning |
| Family | Machine learning | Machine learning |
| Year of origin ≠ | 2021–2022 | 2018–2020 |
| Creator ≠ | Dosovitskiy et al. (ViT); weak supervision paradigm from Zhou and others | LeCun, Y. and community (formalized ~2018–2020) |
| Type ≠ | Self-attention image model with weakly supervised training | Representation learning paradigm |
| Foundational source ≠ | Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR). | LeCun, Y. & Misra, I. (2022). Self-supervised learning: The dark matter of intelligence. Meta AI Blog. https://ai.facebook.com/blog/self-supervised-learning-the-dark-matter-of-intelligence/ |
| Other names | WS-ViT, weakly supervised ViT, weak supervision with vision transformer, ViT with weak labels | SSL, self-supervised pre-training, pretext-task learning, unsupervised representation learning |
| Related ≠ | 4 | 3 |
| Summary ≠ | Weakly Supervised Vision Transformer (WS-ViT) trains a Vision Transformer on image data that lacks precise pixel-level annotations, instead using cheaper, noisier supervision such as image-level class tags, bounding boxes, or web-scraped text. The transformer's global self-attention helps it localise objects and learn discriminative features from these incomplete labels. | Self-supervised learning (SSL) is a machine-learning paradigm that generates its own supervisory signal directly from unlabeled data by defining an auxiliary pretext task, such as predicting masked words, predicting image rotations, or contrasting augmented views. The learned representations then serve as a starting point for downstream tasks that have few labeled examples. |
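
The phrase "generates its own supervisory signal directly from unlabeled data" in the summary can be made concrete with a toy version of the rotation-prediction pretext task. The sketch below is only an illustration under stated assumptions: the image is a small nested list, and the helper names `rotate90` and `make_rotation_examples` are hypothetical, not taken from any library or from the cited sources.

```python
from typing import List, Tuple

Image = List[List[int]]  # toy grayscale image as a nested list (assumption)


def rotate90(image: Image) -> Image:
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_examples(image: Image) -> List[Tuple[Image, int]]:
    """Build (rotated_image, label) pairs from one unlabeled image.

    The label k means "rotated by k * 90 degrees clockwise". It comes from
    the transformation itself, so no human annotation is needed.
    """
    examples = []
    current = [list(row) for row in image]
    for k in range(4):
        examples.append((current, k))
        current = rotate90(current)
    return examples


unlabeled = [[1, 2],
             [3, 4]]
pairs = make_rotation_examples(unlabeled)
for rotated, label in pairs:
    print(label, rotated)
```

A model trained to predict the label from the rotated image would have to pick up on image structure, which is the representation that later gets reused for downstream tasks. Weak supervision, by contrast, still relies on external labels, only coarser or noisier ones.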