Method comparison
View the selected methods side by side; rows that differ are marked with ≠.
| | CatBoost | Gradient Boosting | Huber regression |
|---|---|---|---|
| Field ≠ | Machine learning | Machine learning | Statistics |
| Family ≠ | Machine learning | Machine learning | Regression model |
| Year introduced ≠ | 2018 | 2001 | 1964 |
| Originator ≠ | Prokhorenkova, L. et al. (Yandex) | Friedman, J. H. | Peter J. Huber |
| Type ≠ | Gradient boosting on decision trees | Ensemble (sequential boosting of decision trees) | Robust linear regression (M-estimation) |
| Original work ≠ | Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A.V. & Gulin, A. (2018). CatBoost: Unbiased Boosting with Categorical Features. In NeurIPS 2018. | Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics, 29(5), 1189–1232. | Huber, P. J. (1964). Robust Estimation of a Location Parameter. Annals of Mathematical Statistics, 35(1), 73–101. |
| Other names | Categorical Boosting, ordered boosting, kategorik gradyan artırma (Turkish) | GBM, gradient boosted trees, gradient boosting machine | Huber M-estimator, Huber loss regression, robust regression, Huber Regresyonu (Turkish) |
| Related | 5 | 5 | 5 |
| Summary ≠ | CatBoost is a gradient boosting algorithm, introduced by Prokhorenkova and colleagues at Yandex in 2018, that handles categorical variables natively and uses ordered target encoding to avoid label leakage. It builds an additive ensemble of trees, using random permutations of the training data when computing target statistics, and is often competitive with or better than XGBoost and LightGBM on category-heavy data. | Gradient Boosting is an ensemble learning method, formalised by Jerome H. Friedman in 2001, that combines a sequence of weak learners (typically shallow decision trees) so that each new tree is fitted to the residual errors of the ensemble built so far. It is the core algorithm behind popular implementations such as XGBoost, LightGBM and CatBoost. | Huber regression is a robust linear regression method, introduced by Peter J. Huber in 1964, that resists the influence of outliers by treating small and large residuals differently. It applies a squared (OLS-like) loss to small residuals and a linear (absolute-value-like) loss to large ones, so extreme observations cannot dominate the fit. |
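
The Huber summary in the table describes a loss that is quadratic for small residuals and linear for large ones. A minimal sketch of that piecewise loss follows, assuming a threshold parameter named `delta` with a default of 1.0; both the name and the default are illustrative choices, not values taken from the table.

```python
def squared_loss(residual: float) -> float:
    """Half squared error, the OLS-like loss used for comparison."""
    return 0.5 * residual ** 2


def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Huber loss for a single residual.

    `delta` (assumed name and default) is the threshold that separates
    "small" residuals from "large" ones.
    """
    magnitude = abs(residual)
    if magnitude <= delta:
        # Small residual: quadratic, identical to the half squared error.
        return 0.5 * residual ** 2
    # Large residual: grows linearly, so outliers contribute far less
    # than they would under a squared loss.
    return delta * (magnitude - 0.5 * delta)


if __name__ == "__main__":
    for r in (0.5, 1.0, 10.0):
        print(r, squared_loss(r), huber_loss(r))
```

For a residual of 10 with `delta = 1.0`, the squared loss is 50 while the Huber loss is 9.5, which illustrates why a single extreme observation pulls a Huber fit much less than an ordinary least squares fit.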