Method comparison
View the selected methods side by side; rows that differ are highlighted.
| | Semi-supervised CatBoost | Semi-supervised Gradient Boosting |
|---|---|---|
| Field | Machine learning | Machine learning |
| Family | Machine learning | Machine learning |
| Year introduced ≠ | 2018 (CatBoost); semi-supervised learning framework predates 2006 | 2006–2010s |
| Originators ≠ | Prokhorenkova et al. (CatBoost); semi-supervised paradigm from Chapelle et al. | Chapelle, Schölkopf & Zien (eds.); applied to GBM variants in subsequent literature |
| Type ≠ | Semi-supervised ensemble (gradient boosting) | Semi-supervised ensemble (self-training + gradient boosted trees) |
| Original work ≠ | Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., & Gulin, A. (2018). CatBoost: unbiased boosting with categorical features. In Advances in Neural Information Processing Systems (NeurIPS), 31. | Yarowsky, D. (1995). Unsupervised word sense disambiguation rivaling supervised methods. Proceedings of ACL 1995, 189–196. (Foundational self-training framework underlying pseudo-label approaches.) |
| Also known as | SSL CatBoost, semi-supervised gradient boosting with CatBoost, CatBoost with unlabeled data, pseudo-label CatBoost | pseudo-label gradient boosting, self-training GBM, semi-supervised GBT, label-propagation boosting |
| Related ≠ | 5 | 6 |
| Summary ≠ | Semi-supervised CatBoost applies CatBoost's ordered gradient boosting framework to settings where only a fraction of training instances carry labels, leveraging unlabeled data through pseudo-labeling or consistency-based strategies to improve model accuracy beyond what labeled data alone would allow. | Semi-supervised gradient boosting combines gradient boosted trees with self-training or pseudo-labeling to exploit large pools of unlabeled data alongside a small labeled set. An initial GBM fit on labeled data assigns confident predictions to unlabeled examples; those pseudo-labeled points are folded back into training and the model is re-boosted, iterating until convergence. This allows practitioners to harness cheap unlabeled data when labels are scarce or expensive. |
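
The self-training loop described in the summary row can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of either cited paper and not CatBoost's API: the booster is a toy stand-in (one-dimensional decision stumps fitted to logistic-loss gradients, standard library only), and the names `fit_gbm`, `predict_proba`, `self_train`, the 0.9 confidence threshold, and the toy data are all hypothetical choices made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimising squared error."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), float("inf"), mean, mean)  # fallback: constant fit
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def fit_gbm(xs, ys, n_rounds=30, lr=1.0):
    """Toy gradient boosting: stumps fitted to the negative logistic-loss gradient."""
    p = min(max(sum(ys) / len(ys), 1e-6), 1 - 1e-6)
    f0 = math.log(p / (1 - p))  # initial score: log-odds of the positive class
    scores = [f0] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        scores = [s + lr * (lv if x <= t else rv) for s, x in zip(scores, xs)]
    return f0, stumps, lr


def predict_proba(model, x):
    """Probability of class 1 for a single feature value x."""
    f0, stumps, lr = model
    return sigmoid(f0 + sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps))


def self_train(x_lab, y_lab, x_unlab, threshold=0.9, max_iters=5):
    """Pseudo-label confident unlabeled points, fold them in, and refit."""
    x_lab, y_lab, pool = list(x_lab), list(y_lab), list(x_unlab)
    model = fit_gbm(x_lab, y_lab)
    for _ in range(max_iters):
        confident, remaining = [], []
        for x in pool:
            p = predict_proba(model, x)
            if max(p, 1 - p) >= threshold:
                confident.append((x, 1 if p >= 0.5 else 0))
            else:
                remaining.append(x)
        if not confident:
            break  # no new pseudo-labels, so stop iterating
        x_lab += [x for x, _ in confident]
        y_lab += [label for _, label in confident]
        pool = remaining
        model = fit_gbm(x_lab, y_lab)  # re-boost on labeled + pseudo-labeled data
    return model, len(x_lab), pool


# Toy data (hypothetical): four labeled points, six unlabeled ones.
model, n_labeled, pool = self_train(
    [-3.0, -2.0, 2.0, 3.0], [0, 0, 1, 1],
    [-2.5, -1.5, -1.0, 1.0, 1.5, 2.5],
)
```

In practice the toy booster would be swapped for a real library model, and the confidence threshold matters: set too low, early mistakes are folded back into training and reinforced in later rounds.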