Method comparison
View the selected methods side by side; rows that differ are highlighted (marked ≠).
| | Semi-supervised Doc2Vec | Word2Vec |
|---|---|---|
| Field ≠ | Deep learning | Text mining |
| Family ≠ | Machine learning | Process / pipeline |
| Year introduced ≠ | 2014–2017 | 2013 |
| Originator ≠ | Le, Q. V. & Mikolov, T. (base Doc2Vec); semi-supervised extensions by various authors circa 2015–2019 | Tomas Mikolov et al. |
| Type ≠ | Semi-supervised representation learning | Neural word-embedding model |
| Original work ≠ | Le, Q. V., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Proceedings of the 31st International Conference on Machine Learning (ICML 2014), PMLR 32(2), 1188–1196. | Mikolov, T., Chen, K., Corrado, G. & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. |
| Also known as | Semi-supervised Paragraph Vector, SS-Doc2Vec, Label-guided PV-DBOW, Semi-supervised PV-DM | word embeddings, skip-gram, continuous bag-of-words, Word2Vec Kelime Gömülmeleri |
| Related ≠ | 3 | 4 |
| Summary ≠ | Semi-supervised Doc2Vec extends the Paragraph Vector framework of Le and Mikolov (2014) by training dense document embeddings on both labeled and unlabeled corpora simultaneously, using available class labels as an auxiliary signal to steer the representation toward task-relevant structure while still exploiting the full unlabeled collection for generalization. | Word2Vec is a neural word-embedding technique introduced by Mikolov and colleagues in 2013 that maps each word in a text corpus to a dense numeric vector. Words that appear in similar contexts end up close together in the vector space, so the embeddings capture semantic similarity that can be measured arithmetically. |
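
The Word2Vec summary notes that semantic similarity between embeddings can be measured arithmetically. A common way to do this is cosine similarity; the sketch below illustrates the idea on hand-written toy vectors. The vectors and the `toy_vectors` name are hypothetical and chosen only for illustration; they are not output from a trained Word2Vec or Doc2Vec model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# Real Word2Vec vectors are learned from a corpus and typically have
# a few hundred dimensions.
toy_vectors = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.75, 0.20],
    "banana": [0.10, 0.20, 0.90],
}

sim_related = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_unrelated = cosine_similarity(toy_vectors["king"], toy_vectors["banana"])

# Vectors pointing in similar directions score closer to 1.
print(f"king vs queen:  {sim_related:.3f}")
print(f"king vs banana: {sim_unrelated:.3f}")
```

The same measure applies to document vectors of the kind Doc2Vec produces, since both methods represent their items as dense numeric vectors.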