Latent structure
Latent Dirichlet Allocation (LDA)
Latent Dirichlet Allocation (LDA) is a generative probabilistic model for collections of discrete data, introduced by Blei, Ng, and Jordan in 2003. It treats each document as a mixture of latent topics and each topic as a probability distribution over words, enabling unsupervised discovery of thematic structure across large text corpora. The 2003 paper that introduced it is among the most cited in machine learning and natural language processing.
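The generative story described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `sample_lda_corpus` and all parameter names and default values are hypothetical, the document length is fixed rather than drawn from a distribution, and the topics are themselves drawn from a Dirichlet prior, which corresponds to the smoothed variant of the model.

```python
import numpy as np


def sample_lda_corpus(n_docs=5, doc_len=20, n_topics=3, vocab_size=10,
                      alpha=0.5, eta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process.

    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Each topic is a probability distribution over the vocabulary
    # (one row per topic, each row sums to 1).
    topics = rng.dirichlet(np.full(vocab_size, eta), size=n_topics)

    # Each document is a mixture of topics
    # (one row per document, each row sums to 1).
    doc_topic = rng.dirichlet(np.full(n_topics, alpha), size=n_docs)

    docs = []
    for theta in doc_topic:
        # Pick a latent topic for every word position in the document.
        z = rng.choice(n_topics, size=doc_len, p=theta)
        # Draw each word from the distribution of its assigned topic.
        words = np.array([rng.choice(vocab_size, p=topics[k]) for k in z])
        docs.append(words)

    return topics, doc_topic, docs


topics, doc_topic, docs = sample_lda_corpus()
print(doc_topic.round(2))  # per-document topic proportions
print(docs[0])             # word ids of the first synthetic document
```

Fitting LDA runs this story in reverse: only the word ids are observed, and inference estimates the topic distributions and the per-document mixtures that plausibly produced them.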
Sources
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022. DOI: 10.5555/944919.944937
- Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84. DOI: 10.1145/2133806.2133826
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning (Ch. 9). Springer. ISBN: 978-0-387-31073-2