Self-supervised LDA Topic Model
Self-supervised LDA combines the probabilistic generative framework of Latent Dirichlet Allocation with self-supervised pretraining signals, such as masked-word prediction or contrastive document objectives, to guide topic discovery without hand-labeled training data. The resulting topic representations are both grounded in distributional statistics and enriched by language structure learned from raw text.
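To make the generative half of this combination concrete, here is a minimal sketch of the standard LDA generative process described by Blei et al. (2003): each document draws a topic mixture from a Dirichlet prior, then draws a topic and a word for every token. It covers neither inference nor the self-supervised objectives, which this summary does not specify. The function names, the vocabulary size, and the prior values are all hypothetical and chosen only for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, doc_length, rng):
    """Generate one document under the LDA generative process.

    topic_word: K word distributions, each a list of V probabilities.
    alpha: Dirichlet prior over the K topics (illustrative values).
    Returns the topic mixture, per-token topics, and per-token word ids.
    """
    num_topics = len(topic_word)
    vocab_size = len(topic_word[0])
    theta = sample_dirichlet(alpha, rng)  # per-document topic mixture
    topics, words = [], []
    for _ in range(doc_length):
        # Pick a topic for this token, then a word from that topic.
        z = rng.choices(range(num_topics), weights=theta)[0]
        w = rng.choices(range(vocab_size), weights=topic_word[z])[0]
        topics.append(z)
        words.append(w)
    return theta, topics, words


rng = random.Random(0)            # fixed seed for repeatability
beta = [0.5] * 6                  # assumed symmetric prior, vocabulary of 6
topic_word = [sample_dirichlet(beta, rng) for _ in range(2)]  # 2 topics
theta, topics, words = generate_document(topic_word, [0.5, 0.5], 20, rng)
```

In a self-supervised variant, signals such as masked-word prediction would presumably shape how the topic mixtures or topic-word distributions are estimated. That coupling is beyond what this sketch shows.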
Sources
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993–1022.
- Meng, Y., Huang, J., Zhang, Y., & Han, J. (2022). Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations. Proceedings of WWW 2022, ACM. DOI: 10.1145/3485447.3512034