ScholarGate

Multimodal Doc2Vec

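Multimodal Doc2Vec extends the Doc2Vec paragraph-vector framework to incorporate information from multiple modalities, typically text combined with images, audio, or structured metadata. It produces a single shared document-level embedding that captures semantics from all of these sources at once. It is used for cross-modal retrieval, multi-source classification, and document representation in cases where text alone is not sufficient.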
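As a rough illustration of what a shared document-level embedding can look like, the sketch below uses weighted concatenation (late fusion) of precomputed per-modality vectors, followed by cosine-similarity retrieval. This is one simple fusion strategy chosen for illustration, not necessarily the method described on this page. The function names `l2_normalize`, `fuse`, and `cosine`, the modality names, and all vector values are hypothetical; in practice the text vector would come from a trained paragraph-vector model and the image vector from an image encoder.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(modalities, weights=None):
    """Concatenate L2-normalized per-modality vectors into one document vector.

    modalities: dict mapping a modality name to its vector (toy inputs here).
    weights: optional dict of per-modality weights; each defaults to 1.0.
    """
    weights = weights or {}
    fused = []
    # Sort names so every document lays out its dimensions in the same order.
    for name in sorted(modalities):
        w = weights.get(name, 1.0)
        fused.extend(w * x for x in l2_normalize(modalities[name]))
    return l2_normalize(fused)

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy documents: a 3-d "text" vector and a 2-d "image" vector each (made-up values).
docs = {
    "doc_a": {"text": [0.9, 0.1, 0.0], "image": [1.0, 0.0]},
    "doc_b": {"text": [0.8, 0.2, 0.1], "image": [0.0, 1.0]},
    "doc_c": {"text": [0.0, 0.1, 0.9], "image": [0.9, 0.1]},
}
embeddings = {name: fuse(parts) for name, parts in docs.items()}

# A query that carries both modalities is fused the same way, then ranked by cosine.
query = fuse({"text": [1.0, 0.0, 0.0], "image": [1.0, 0.0]})
ranked = sorted(embeddings, key=lambda n: cosine(query, embeddings[n]), reverse=True)
print(ranked)
```

Normalizing each modality before concatenation keeps a high-dimensional or large-magnitude modality from dominating the fused vector; the optional `weights` argument is a simple way to tilt the balance when one source is known to be more reliable.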


Sources

  1. Le, Q. V., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Proceedings of the 31st International Conference on Machine Learning (ICML), PMLR 32(2), 1188–1196.
  2. Ngiam, J., Khosla, A., Kim, M., Nam, J., Lee, H., & Ng, A. Y. (2011). Multimodal Deep Learning. Proceedings of the 28th International Conference on Machine Learning (ICML), 689–696.

How to cite this page

ScholarGate. (2026, June 3). Multimodal Doc2Vec (Paragraph Vector with Multi-Source Input). ScholarGate. https://scholargate.app/zh/deep-learning/multimodal-doc2vec


Cited in

ScholarGate. Multimodal Doc2Vec (Paragraph Vector with Multi-Source Input). Retrieved 2026-06-15 from https://scholargate.app/zh/deep-learning/multimodal-doc2vec · Dataset: https://doi.org/10.5281/zenodo.20539026