Multilingual Topic Modeling
Multilingual topic modeling extends probabilistic topic models such as LDA to corpora spanning two or more languages, inferring shared latent topics across language boundaries. By tying topic distributions across languages, it enables cross-lingual document analysis, comparable topic discovery, and information retrieval without requiring full parallel corpora.
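One way to picture how topic distributions are tied across languages is the generative story of the polylingual topic model (Mimno et al., 2009). Each tuple of comparable documents shares a single topic mixture, while every language keeps its own topic-word distributions. The sketch below simulates that process with the Python standard library only. The function names, vocabulary sizes, and hyperparameter values are illustrative assumptions, not taken from any reference implementation.

```python
import random


def sample_dirichlet(rng, alpha):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_tuple(rng, phi_by_lang, alpha, lengths):
    """Generate one tuple of comparable documents, one document per language.

    phi_by_lang: dict mapping language -> list of K topic-word distributions
                 (each a list of probabilities over that language's vocabulary).
    alpha:       list of K Dirichlet hyperparameters for the topic mixture.
    lengths:     dict mapping language -> number of tokens to generate.
    """
    # A single topic mixture is shared by every document in the tuple;
    # this is what ties the topics together across languages.
    theta = sample_dirichlet(rng, alpha)
    topics = range(len(theta))
    docs = {}
    for lang, phi in phi_by_lang.items():
        tokens = []
        for _ in range(lengths[lang]):
            k = rng.choices(topics, weights=theta)[0]               # topic assignment
            w = rng.choices(range(len(phi[k])), weights=phi[k])[0]  # word id in this language
            tokens.append(w)
        docs[lang] = tokens
    return theta, docs


# Toy setup (hypothetical sizes): 3 topics, 8 English and 10 German word types.
rng = random.Random(0)
K = 3
vocab_sizes = {"en": 8, "de": 10}
phi_by_lang = {
    lang: [sample_dirichlet(rng, [0.5] * v) for _ in range(K)]
    for lang, v in vocab_sizes.items()
}
theta, docs = sample_tuple(rng, phi_by_lang, [0.5] * K, {"en": 20, "de": 25})
```

Here `docs["en"]` and `docs["de"]` hold word ids drawn from different vocabularies but from the same `theta`. Inference reverses this process: given only the document tuples, it estimates the shared mixtures and the per-language topic-word distributions.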
Sources
- Mimno, D., Wallach, H. M., Naradowsky, J., Smith, D. A., & McCallum, A. (2009). Polylingual topic models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 880–889. ACL.
- Vulić, I., De Smet, W., & Moens, M.-F. (2015). Monolingual and cross-lingual information retrieval models based on (bilingual) word embeddings. In Proceedings of SIGIR 2015, pp. 363–372. ACM.