CLIP: Contrastive Language-Image Pretraining
CLIP (Contrastive Language-Image Pretraining) is a vision-language model introduced by Radford et al. at OpenAI in 2021. It jointly learns image and text representations by training on 400 million image-text pairs collected from the internet with a contrastive objective. This enables zero-shot transfer to image classification tasks without any task-specific fine-tuning.
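The contrastive objective and the zero-shot step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the image and text embeddings have already been produced by some encoders, and the function names (`contrastive_loss`, `zero_shot_classify`) and the fixed temperature value of 0.07 are illustrative assumptions rather than details stated on this page.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an N x N image-text similarity matrix.

    Row i of image_emb and row i of text_emb are assumed to be a matched pair;
    every other combination in the batch is treated as a negative.
    The temperature is fixed here for simplicity (an assumption of this sketch).
    """
    img = l2_normalize(np.asarray(image_emb, dtype=float))
    txt = l2_normalize(np.asarray(text_emb, dtype=float))
    logits = img @ txt.T / temperature           # shape (N, N)
    idx = np.arange(logits.shape[0])
    loss_image_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_text_to_image = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_image_to_text + loss_text_to_image)


def zero_shot_classify(image_emb, class_text_emb):
    """Assign each image to the class whose text embedding is most similar."""
    img = l2_normalize(np.asarray(image_emb, dtype=float))
    txt = l2_normalize(np.asarray(class_text_emb, dtype=float))
    return np.argmax(img @ txt.T, axis=1)
```

In this sketch, a batch whose pairs are correctly aligned yields a lower loss than the same batch with the captions shuffled, which is the signal that pushes matched images and texts together. For zero-shot classification, the class text embeddings would come from encoding one caption per class name, so no classifier has to be trained for the new task.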
Sources
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 8748–8763.
- Radford, A., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020.
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. ISBN: 978-0-262-03561-3
How to cite this page
ScholarGate. (2026, June 3). Contrastive Language-Image Pretraining. ScholarGate. https://scholargate.app/sw/deep-learning/clip