Fine-Tuned Doc2Vec adapts a pre-trained Paragraph Vector (Doc2Vec) model by continuing its training on a target corpus, producing document embeddings that capture both the general language knowledge of the original training and the vocabulary and style of the new domain. It is used for text classification, semantic similarity, and clustering when labeled data are scarce but unlabeled domain text is available.
Le, Q. V., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Proceedings of the 31st International Conference on Machine Learning (ICML 2014), PMLR 32(2), 1188–1196. link ↗