BERTopic — Neural Topic Modeling
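BERTopic is a neural topic-modeling pipeline introduced by Maarten Grootendorst in 2022. It embeds documents with a pre-trained, BERT-based sentence-embedding model, reduces the embeddings with UMAP, clusters them with HDBSCAN, and extracts each topic's representative words with a class-based TF-IDF (c-TF-IDF) procedure. Grootendorst (2022) reports that the resulting topics are coherent and competitive with classic topic models across several benchmarks, and that the approach extends to dynamic topic modeling.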
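The topic-representation step named in the title of Grootendorst (2022), class-based TF-IDF, can be illustrated with a minimal standard-library sketch. It assumes the documents have already been embedded, reduced, and clustered, so the input is simply tokenized documents grouped by cluster. The function names `c_tf_idf` and `top_terms`, the toy data, and the use of raw (unnormalized) term counts are assumptions made for illustration; this is not the BERTopic library's API or implementation. The weight follows the formula given in the paper, W(t, c) = tf(t, c) · log(1 + A / tf(t)), where A is the average number of words per class and tf(t) is the frequency of term t across all classes.

```python
import math
from collections import Counter


def c_tf_idf(docs_by_topic):
    """Compute class-based TF-IDF weights.

    docs_by_topic: dict mapping a topic id to a list of tokenized documents.
    Returns a dict mapping each topic id to {term: weight}.
    """
    # Treat all documents in a cluster as one "class document" and count terms.
    tf = {
        topic: Counter(tok for doc in docs for tok in doc)
        for topic, docs in docs_by_topic.items()
    }
    # Frequency of each term summed over all classes.
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    # A: average number of words per class.
    avg_words = sum(total.values()) / len(tf)
    # W(t, c) = tf(t, c) * log(1 + A / tf(t)); raw counts are an assumption here.
    return {
        topic: {t: f * math.log(1 + avg_words / total[t]) for t, f in counts.items()}
        for topic, counts in tf.items()
    }


def top_terms(weights, n=3):
    """Return the n highest-weighted terms per topic (ties broken alphabetically)."""
    return {
        topic: [t for t, _ in sorted(w.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for topic, w in weights.items()
    }


# Toy clusters (hypothetical data): "the" appears in every class.
clusters = {
    0: [["the", "cat", "dog", "pet"], ["the", "dog", "pet", "vet"]],
    1: [["the", "stock", "market", "trade"], ["the", "market", "price", "stock"]],
}
weights = c_tf_idf(clusters)
print(top_terms(weights, n=2))
```

In this toy example "the" occurs as often within each cluster as the topical words do, but because it also occurs in every other cluster its weight is lower, so it does not appear among the top terms.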
Sources
- Grootendorst, M. (2022). BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv:2203.05794. DOI: 10.48550/arXiv.2203.05794
- McInnes, L., Healy, J. & Astels, S. (2017). hdbscan: Hierarchical density based clustering. Journal of Open Source Software, 2(11), 205. DOI: 10.21105/joss.00205