Topic Models

28 methods belong to this family.

Featured

BERTopic
BERTopic is a neural topic-modeling pipeline introduced by Maarten Grootendorst in 2022. It combines BERT-based contextual embeddings with UMAP dimensionality reduction and HDBSCAN …

Domain-Adaptive NMF Topic Model
Domain-adaptive NMF Topic Modeling applies Non-negative Matrix Factorization to discover latent topics across text from multiple domains, using regularization or shared basis const…

Explainable LDA Topic Model
Explainable LDA combines Latent Dirichlet Allocation, the canonical probabilistic topic model introduced by Blei, Ng, and Jordan in 2003, with post-hoc and intrinsic interpretabi…

Explainable NMF Topic Model
An Explainable NMF Topic Model combines Non-negative Matrix Factorization, a parts-based decomposition of a document-term matrix, with explicit interpretability techniques such a…

Explainable Topic Modeling
Explainable Topic Modeling combines unsupervised topic discovery (such as LDA, NMF, or neural variants like BERTopic) with interpretability tools (top-word lists, coherence score…

Fine-Tuned LDA Topic Model
Fine-Tuned LDA adapts a Latent Dirichlet Allocation model trained on a large general corpus to a specific target domain by continuing inference on domain-specific documents. Rather…

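Reading Path

The most-cited foundational methods in this topic, listed in the order they were introduced. If you are new to the area, this is a good place to start.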

NMF Topic Model, 1999. Authors: Lee, D. D. & Seung, H. S.
Topic Modeling, 1999–2003. Authors: Hofmann, T. (pLSA, 1999); Blei, D. M., Ng, A. Y., & Jordan, M. I. (LDA, 2003)
Latent Dirichlet Allocation (LDA), 2003. Authors: Blei, D. M.; Ng, A. Y.; Jordan, M. I.
LDA Topic Model, 2003. Authors: Blei, D. M., Ng, A. Y., & Jordan, M. I.
Semi-Supervised LDA Topic Model, 2009. Authors: Ramage, D.; Andrzejewski, D. et al.
Semi-Supervised Topic Modeling, 2009. Authors: Ramage, D.; Andrzejewski, D.; and related NLP community
Transfer Learning with NMF Topic Models, 2010 (transfer learning survey); 1999 (NMF). Authors: Pan, S. J. & Yang, Q. (transfer learning framework); Lee, D. D. & Seung, H. S. (NMF base)
BERTopic, 2022. Author: Maarten Grootendorst


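More in Natural Language Processing

Text Representation and Preprocessing (135)
Natural Language Processing Tasks (32)
Embeddings and Language Models (11)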