Method comparison
Review the selected methods side by side; rows that differ between methods are marked with ≠.
| | Missing data mechanisms: MCAR, MAR, and MNAR | EM algorithm | MICE |
|---|---|---|---|
| Field | Statistics | Statistics | Statistics |
| Method family ≠ | Process / pipeline | Machine learning | Process / pipeline |
| Year of origin ≠ | 1976 | 1977 | 2011 |
| Creator ≠ | Donald Rubin | Dempster, Laird & Rubin | Stef van Buuren & Karin Groothuis-Oudshoorn |
| Type ≠ | Diagnostic / classification framework | Iterative optimization algorithm | Iterative multiple imputation algorithm |
| Foundational work ≠ | Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581–592. | Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39(1), 1–38. | van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67. |
| Alternative names | Missing Data Typology, Rubin's Missing Data Framework, Missingness Mechanisms, Kayıp Veri Mekanizmaları (Turkish) | EM, Expectation-Maximization, Maximum Likelihood via Incomplete Data, BM Algoritması (Turkish) | Fully Conditional Specification, Sequential Regression Multivariate Imputation, Chained Equations Imputation, Zincirleme Denklemlerle Çoklu Atama (Turkish) |
| Related (count) ≠ | 3 | 2 | 3 |
| Summary ≠ | Missing data mechanisms, introduced by Donald Rubin in 1976, provide a formal taxonomy for classifying why observations are absent from a dataset. The three categories, Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), describe how the probability of missingness relates to the observed and unobserved values. Identifying the correct mechanism is essential because it determines which analytical strategies preserve valid and unbiased inference. | The Expectation-Maximization (EM) algorithm is an iterative optimization procedure for finding maximum likelihood or maximum a posteriori estimates of parameters in statistical models with latent variables or missing data. Introduced by Dempster, Laird, and Rubin in their landmark 1977 paper, EM alternates between computing the expected complete-data log-likelihood (E-step) and maximizing it with respect to the parameters (M-step), which guarantees that the likelihood never decreases from one iteration to the next. | Multivariate Imputation by Chained Equations (MICE) is an iterative procedure for handling missing data in multivariate datasets. Introduced by Stef van Buuren and Karin Groothuis-Oudshoorn through the R package mice (2011), the algorithm imputes each incomplete variable using a separate regression model conditioned on all other variables, cycling through the variables repeatedly until the imputed values converge. The result is m completed datasets that are analysed separately and combined using Rubin's rules. |
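
The E-step/M-step alternation described in the EM summary can be illustrated with a small sketch. It is not taken from the cited papers: it assumes a one-dimensional mixture of two Gaussians fitted to synthetic data, and the names `em_two_gaussians`, `normal_pdf`, `log_likelihood`, and `var_floor` are hypothetical helpers introduced here. The recorded log-likelihood history makes the non-decreasing property easy to inspect.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weight, means, variances):
    """Observed-data log-likelihood of the two-component mixture."""
    return sum(
        math.log(
            weight * normal_pdf(x, means[0], variances[0])
            + (1.0 - weight) * normal_pdf(x, means[1], variances[1])
        )
        for x in data
    )


def em_two_gaussians(data, n_iter=100, var_floor=1e-6):
    """Fit a two-component Gaussian mixture with EM (illustrative sketch).

    Returns the mixing weight of component 0, the two means, the two
    variances, and the log-likelihood after every iteration.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n

    # Assumed starting values: equal weights, means at the data extremes.
    weight = 0.5
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    history = [log_likelihood(data, weight, means, variances)]

    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, means[0], variances[0])
            p1 = (1.0 - weight) * normal_pdf(x, means[1], variances[1])
            resp.append(p0 / (p0 + p1))

        # M-step: weighted maximum likelihood updates of all parameters.
        n0 = sum(resp)
        n1 = n - n0
        weight = n0 / n
        means = [
            sum(r * x for r, x in zip(resp, data)) / n0,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / n1,
        ]
        # var_floor is a numerical guard against a collapsing component.
        variances = [
            max(sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n0, var_floor),
            max(sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n1, var_floor),
        ]
        history.append(log_likelihood(data, weight, means, variances))

    return weight, means, variances, history


# Synthetic data: two well-separated clusters (an assumption for the demo).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(5.0, 1.0) for _ in range(100)]
weight, means, variances, history = em_two_gaussians(data)
print("estimated means:", means)
print("log-likelihood, first and last:", history[0], history[-1])
```

Here the component label of each point plays the role of the latent (missing) variable. The same alternation applies to incomplete data more generally, although the E-step then depends on the chosen model.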