Semi-supervised Gradient Boosting

Semi-supervised gradient boosting combines gradient boosted trees with self-training (pseudo-labeling) to exploit a large pool of unlabeled data alongside a small labeled set. An initial gradient boosting machine (GBM), fit on the labeled data only, predicts labels for the unlabeled examples; the predictions made with high confidence are kept as pseudo-labels, those points are folded back into the training set, and the model is re-boosted. The loop repeats until convergence, for example until no further examples clear the confidence bar or a fixed number of rounds has run. This lets practitioners make use of cheap unlabeled data when labels are scarce or expensive.
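
The self-training loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the trees are depth-1 stumps boosted on the logistic-loss gradient, and the function names (`fit_stump`, `fit_gbm`, `predict_proba`, `self_train`), the 0.9 confidence threshold, the 5-iteration cap, the 30 boosting rounds, and the synthetic two-cluster data are all choices made here for illustration. In practice a library GBM would normally replace the hand-written booster.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree (stump) to the residuals by least squares."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, 0.0, mean, mean)
    return best[1:]  # (feature index, threshold, left value, right value)


def fit_gbm(X, y, n_rounds=30, lr=1.0):
    """Boost stumps on the negative gradient of the logistic loss."""
    p = min(max(sum(y) / len(y), 1e-6), 1 - 1e-6)
    f0 = math.log(p / (1 - p))  # initial score: log-odds of the positive class
    scores = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        scores = [s + lr * (lv if row[j] <= t else rv) for row, s in zip(X, scores)]
    return f0, stumps, lr


def predict_proba(model, x):
    """Probability of class 1 for a single feature vector x."""
    f0, stumps, lr = model
    return sigmoid(f0 + sum(lr * (lv if x[j] <= t else rv) for j, t, lv, rv in stumps))


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_iter=5):
    """Self-training: pseudo-label confident points, add them, refit, repeat."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = fit_gbm(X_train, y_train)  # initial fit on labeled data only
    for _ in range(max_iter):
        confident, remaining = [], []
        for x in pool:
            p = predict_proba(model, x)
            if max(p, 1 - p) >= threshold:
                confident.append((x, 1 if p >= 0.5 else 0))
            else:
                remaining.append(x)
        if not confident:  # nothing clears the threshold: stop iterating
            break
        X_train += [x for x, _ in confident]
        y_train += [label for _, label in confident]
        pool = remaining
        model = fit_gbm(X_train, y_train)  # re-boost on labeled + pseudo-labeled data
    return model, len(X_train) - len(X_lab)


# Synthetic demo data (an assumption for illustration): two Gaussian clusters.
random.seed(0)


def sample(center, n):
    return [[random.gauss(center, 1.0), random.gauss(center, 1.0)] for _ in range(n)]


X_lab = sample(-3.0, 5) + sample(3.0, 5)        # small labeled set
y_lab = [0] * 5 + [1] * 5
X_unlab = sample(-3.0, 50) + sample(3.0, 50)    # larger unlabeled pool
X_test = sample(-3.0, 50) + sample(3.0, 50)
y_test = [0] * 50 + [1] * 50

model, n_pseudo = self_train(X_lab, y_lab, X_unlab)
accuracy = sum(
    (predict_proba(model, x) >= 0.5) == bool(y) for x, y in zip(X_test, y_test)
) / len(y_test)
print(f"pseudo-labeled points added: {n_pseudo}, test accuracy: {accuracy:.2f}")
```

The confidence threshold is the main design lever in this sketch: a higher value admits fewer pseudo-labels but lowers the chance that a wrong label is folded into training and reinforced in later rounds.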

Sources

  1. Yarowsky, D. (1995). Unsupervised word sense disambiguation rivaling supervised methods. Proceedings of ACL 1995, 189–196. (Foundational self-training framework underlying pseudo-label approaches.)
  2. Chapelle, O., Schölkopf, B., & Zien, A. (Eds.) (2006). Semi-Supervised Learning. MIT Press. ISBN: 978-0-262-03358-9.

ScholarGate. Semi-supervised Gradient Boosting (Self-training / Pseudo-labeling with Gradient Boosted Trees). Retrieved 2026-06-04 from https://scholargate.app/en/machine-learning/semi-supervised-gradient-boosting