CLIP: Contrastive Language-Image Pretraining

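CLIP (Contrastive Language-Image Pretraining) is a vision-language model introduced by Radford et al. at OpenAI in 2021. It is trained on 400 million image-text pairs collected from the internet, using a contrastive objective to jointly learn aligned image and text representations. Because images and text share one embedding space, the model can transfer zero-shot to image classification tasks without any task-specific fine-tuning.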
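The contrastive objective and the zero-shot step can be illustrated with a small NumPy sketch. This is a minimal illustration under stated assumptions, not the reference implementation: the embeddings are taken as given arrays (the image and text encoders are omitted), the function names are hypothetical, and the temperature is treated as a fixed constant chosen for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each vector to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over an N x N image-text similarity matrix.

    Assumption: row i of image_emb and row i of text_emb form a matched pair,
    and every other row in the batch acts as a negative. The temperature is a
    fixed illustrative constant here.
    """
    image_emb = l2_normalize(image_emb)
    text_emb = l2_normalize(text_emb)
    logits = image_emb @ text_emb.T / temperature  # shape (N, N)
    idx = np.arange(logits.shape[0])
    # Image-to-text: each row should put its mass on the diagonal entry.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each column should put its mass on the diagonal entry.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


def zero_shot_classify(image_emb, class_text_emb):
    """For each image, return the index of the most similar class text embedding."""
    sims = l2_normalize(image_emb) @ l2_normalize(class_text_emb).T
    return sims.argmax(axis=1)
```

In this sketch, zero-shot classification needs no extra training: each candidate class name is embedded as text (for example via a prompt such as "a photo of a dog", an assumed template), and `zero_shot_classify` picks the class whose text embedding has the highest cosine similarity with the image embedding.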

Sources

  1. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 8748–8763.
  2. Radford, A., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020.
  3. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. ISBN: 978-0-262-03561-3.

How to cite this page

ScholarGate. (2026, June 3). Contrastive Language-Image Pretraining. ScholarGate. https://scholargate.app/zh/deep-learning/clip

Cited as

ScholarGate. CLIP (Contrastive Language-Image Pretraining). Retrieved 2026-06-15 from https://scholargate.app/zh/deep-learning/clip · Dataset: https://doi.org/10.5281/zenodo.20539026