Domain-adaptive BERT-based Classification

Domain-adaptive BERT-based classification adds a stage to the standard fine-tuning pipeline. BERT's masked-language-model pre-training is first continued on a large corpus of unlabeled in-domain text, and the adapted model is then fine-tuned on labeled examples for the target classification task. This two-stage approach narrows the gap in vocabulary usage and word distribution between BERT's general pre-training corpus and specialized domains such as biomedicine, law, finance, or social-media text.
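The first stage reuses BERT's masked-language-model objective on unlabeled in-domain text. As a rough illustration of how one training example for that stage could be built, the sketch below applies the masking scheme described in the original BERT paper (about 15% of positions selected; of those, 80% replaced by `[MASK]`, 10% replaced by a random token, 10% left unchanged). The function name `mask_tokens`, the whitespace tokenization, and the toy vocabulary are assumptions made for this example; a real pipeline would use BERT's WordPiece tokenizer and a training library.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Build one masked-language-model example (hypothetical helper).

    Returns (inputs, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_TOKEN          # 80%: replace with [MASK]
        elif roll < 0.9:
            inputs[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: keep the original token in the input
    return inputs, labels


# Toy in-domain sentence (assumed example, whitespace-tokenized).
tokens = "the patient was given 5 mg of warfarin daily".split()
vocab = sorted(set(tokens))
inputs, labels = mask_tokens(tokens, vocab, seed=1)
print(inputs)
print(labels)
```

Only the selected positions contribute to the loss, so the model learns in-domain word usage without needing any labels. The second stage would then attach a classification head to the adapted encoder and train on the labeled task data.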

Sources

  1. Gururangan, S., Marasovic, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., & Smith, N. A. (2020). Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), 8342–8360. DOI: 10.18653/v1/2020.acl-main.740
  2. Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C. H., & Kang, J. (2020). BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4), 1234–1240. DOI: 10.1093/bioinformatics/btz682

ScholarGate. Domain-adaptive BERT-based Classification (Domain-Adaptive Pre-training with BERT for Text Classification). Retrieved 2026-06-04 from https://scholargate.app/en/deep-learning/domain-adaptive-bert-based-classification