Multilingual Topic Modeling

Multilingual topic modeling extends probabilistic topic models such as latent Dirichlet allocation (LDA) to corpora spanning two or more languages, inferring latent topics that are shared across language boundaries. By tying topic distributions across languages, it enables cross-lingual document analysis, discovery of comparable topics, and information retrieval without requiring fully parallel corpora.
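To make the idea of "tying topic distributions across languages" concrete, here is a minimal sketch of one possible generative process, loosely in the spirit of the polylingual topic model of Mimno et al. (2009) listed under Sources. It assumes that each tuple of comparable documents shares a single topic mixture `theta`, while every language keeps its own topic-word distributions. The function names, the toy vocabularies, and the hand-set probabilities are all hypothetical and chosen only for illustration. The sketch covers sampling only, not inference.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Return an index drawn with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_tuple(vocabs, topic_word, alpha, doc_lengths, rng):
    """Generate one tuple of comparable documents, one per language.

    vocabs:      {lang: [word, ...]}
    topic_word:  {lang: [[p(word | topic), ...], ...]}, one row per topic
    alpha:       Dirichlet prior over topics, shared by all languages
    doc_lengths: {lang: number of tokens to generate}
    """
    # The tie across languages: one topic mixture for the whole tuple.
    theta = sample_dirichlet(alpha, rng)
    docs = {}
    for lang, vocab in vocabs.items():
        tokens = []
        for _ in range(doc_lengths[lang]):
            z = sample_categorical(theta, rng)             # topic for this token
            w = sample_categorical(topic_word[lang][z], rng)  # language-specific word
            tokens.append(vocab[w])
        docs[lang] = tokens
    return theta, docs


# Toy setup (assumed): topic 0 is "sports", topic 1 is "politics".
rng = random.Random(0)
vocabs = {
    "en": ["goal", "match", "vote", "law"],
    "de": ["Tor", "Spiel", "Wahl", "Gesetz"],
}
topic_word = {
    "en": [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
    "de": [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
}
theta, docs = generate_tuple(vocabs, topic_word, [0.5, 0.5], {"en": 6, "de": 8}, rng)
print(theta)
print(docs)
```

Note that the two documents need not be translations of each other and may differ in length. They only share `theta`, which is why a model of this kind can be trained on comparable rather than fully parallel text.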

Sources

  1. Mimno, D., Wallach, H. M., Naradowsky, J., Smith, D. A., & McCallum, A. (2009). Polylingual topic models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 880–889. ACL.
  2. Vulić, I., & Moens, M.-F. (2015). Monolingual and cross-lingual information retrieval models based on (bilingual) word embeddings. In Proceedings of SIGIR 2015, pp. 363–372. ACM.

ScholarGate. Multilingual topic modeling (Cross-lingual Latent Topic Inference). Retrieved 2026-06-04 from https://scholargate.app/en/deep-learning/multilingual-topic-modeling