Compare methods
View the selected methods side by side; rows that differ are marked with ≠.
| | Semi-supervised topic modeling | Latent Dirichlet Allocation (LDA) |
|---|---|---|
| Field ≠ | Deep learning | Machine learning |
| Family ≠ | Machine learning | Latent structure |
| Year introduced ≠ | 2009 | 2003 |
| Creator ≠ | Ramage, D.; Andrzejewski, D.; and others in the NLP community | Blei, D. M.; Ng, A. Y.; Jordan, M. I. |
| Type ≠ | Probabilistic graphical model (supervised/constrained extension of LDA) | Generative probabilistic topic model (three-level hierarchical Bayesian) |
| Original source ≠ | Ramage, D., Hall, D., Nallapati, R., & Manning, C. D. (2009). Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 248–256. Association for Computational Linguistics. | Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022. |
| Other names ≠ | semi-supervised LDA, labeled LDA, seed-guided topic modeling, constrained topic model | LDA, topic model, Blei-Ng-Jordan model, probabilistic topic modeling |
| Related | 3 | 3 |
| Summary ≠ | Semi-supervised topic modeling extends unsupervised topic models such as LDA by incorporating partial human supervision (seed words, labeled documents, or must-link/cannot-link constraints) to steer the discovered topics toward meaningful, domain-relevant categories while still exploiting the large unlabeled corpus for statistical strength. | Latent Dirichlet Allocation (LDA) is a generative probabilistic model for collections of discrete data, introduced by Blei, Ng, and Jordan in 2003. It treats each document as a mixture of latent topics and each topic as a probability distribution over words, enabling unsupervised discovery of thematic structure across large text corpora. The 2003 paper that introduced it is among the most cited in machine learning and natural language processing. |
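
The two summaries above describe a generative story: each document is a mixture of topics, each topic is a distribution over words, and seed words can steer topics toward chosen categories. The following is a minimal sketch of that story, assuming the smoothed LDA variant (a Dirichlet prior over each topic's word distribution) and modeling seed guidance as extra pseudo-counts in that prior. The names `sample_dirichlet`, `generate_corpus`, `seed_words`, and `seed_boost` are hypothetical and chosen for illustration only; this is not the Labeled LDA procedure from Ramage et al. (2009), nor any library's API.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(vocab, n_topics, n_docs, doc_len,
                    alpha=0.5, beta=0.1,
                    seed_words=None, seed_boost=5.0, rng=None):
    """Sample a toy corpus from an LDA-style generative process.

    seed_words is an assumed {topic_index: [word, ...]} mapping. A seeded
    word receives seed_boost extra pseudo-count in that topic's prior,
    which is one simple way to express seed-word guidance.
    """
    rng = rng or random.Random(0)
    seed_words = seed_words or {}

    # One word distribution per topic, drawn from a (possibly seeded) prior.
    topics = []
    for k in range(n_topics):
        seeded = set(seed_words.get(k, ()))
        prior = [beta + (seed_boost if w in seeded else 0.0) for w in vocab]
        topics.append(sample_dirichlet(prior, rng))

    docs = []
    for _ in range(n_docs):
        # Per-document topic mixture.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            # Pick a topic for this word position, then a word from it.
            z = rng.choices(range(n_topics), weights=theta)[0]
            words.append(rng.choices(vocab, weights=topics[z])[0])
        docs.append(words)
    return topics, docs


if __name__ == "__main__":
    vocab = ["goal", "match", "team", "vote", "party", "law"]
    topics, docs = generate_corpus(
        vocab, n_topics=2, n_docs=3, doc_len=8,
        seed_words={0: ["goal", "team"], 1: ["vote", "law"]},
    )
    print(docs[0])
```

Calling `generate_corpus` with `seed_words=None` corresponds to the plain unsupervised setting. Inference (recovering topics from observed documents) is the harder direction and is not shown here.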