CLIP: Contrastive Language-Image Pretraining

CLIP (Contrastive Language-Image Pretraining) is a vision-language model introduced by Radford et al. at OpenAI in 2021. It jointly learns image and text representations by training with a contrastive objective on 400 million image-text pairs collected from the internet. This enables zero-shot transfer to image classification tasks without any task-specific fine-tuning.
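
The contrastive objective and the zero-shot step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the OpenAI implementation: the embeddings are taken as given (no image or text encoder is shown), the function names are hypothetical, and the temperature is fixed at 0.07 rather than learned. Within a batch, each image should score highest against its own caption and each caption against its own image, so the loss averages a cross-entropy over rows and over columns of the similarity matrix.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def similarity_logits(image_embs, text_embs, temperature=0.07):
    """Cosine similarity of every image with every text, divided by the temperature.
    The fixed temperature is an assumption made for this sketch."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy: pair i in the batch is the only positive for row i
    and for column i; every other entry acts as a negative."""
    logits = similarity_logits(image_embs, text_embs, temperature)
    n = len(logits)
    # Image-to-text direction: row i should select column i.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: column j should select row j.
    columns = [list(col) for col in zip(*logits)]
    loss_t2i = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2

def zero_shot_classify(image_emb, class_text_embs, temperature=0.07):
    """Pick the class whose text embedding is most similar to the image embedding.
    Returns (index of best class, probabilities over classes)."""
    logits = similarity_logits([image_emb], class_text_embs, temperature)[0]
    probs = [math.exp(x) for x in log_softmax(logits)]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs
```

In practice the class text embeddings would come from encoding a prompt for each label, and the image embedding from the image encoder; both are stand-ins here. Zero-shot classification needs no extra training because it reuses the same image-text similarity the model was trained on.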

Sources

  1. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 8748–8763.
  2. Radford, A., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020.
  3. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. ISBN: 978-0-262-03561-3

How to cite this page

ScholarGate. (2026, June 3). Contrastive Language-Image Pretraining. ScholarGate. https://scholargate.app/sw/deep-learning/clip

Referenced by

ScholarGate. CLIP (Contrastive Language-Image Pretraining). Retrieved 2026-06-15 from https://scholargate.app/sw/deep-learning/clip · Dataset: https://doi.org/10.5281/zenodo.20539026