Compare methods
View the selected methods side by side; rows that differ are marked with ≠.
| | Missing data mechanisms: MCAR, MAR, and MNAR | EM algorithm | Multiple imputation |
|---|---|---|---|
| Domain | Statistics | Statistics | Statistics |
| Family ≠ | Process / pipeline | Machine learning | Process / pipeline |
| Year introduced ≠ | 1976 | 1977 | 1987 |
| Creator ≠ | Donald Rubin | Dempster, Laird & Rubin | Donald B. Rubin |
| Type ≠ | Diagnostic / classification framework | Iterative optimization algorithm | Missing-data handling procedure |
| Primary source ≠ | Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581–592. | Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39(1), 1–38. | Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley. |
| Other names ≠ | Missing Data Typology, Rubin's Missing Data Framework, Missingness Mechanisms, Kayıp Veri Mekanizmaları | EM, Expectation-Maximization, Maximum Likelihood via Incomplete Data, BM Algoritması | MICE, Multivariate Imputation by Chained Equations, Çoklu Atama (Multiple Imputation, MICE) |
| Related ≠ | 3 | 2 | 1 |
| Summary ≠ | Missing data mechanisms, introduced by Donald Rubin in 1976, provide a formal taxonomy for classifying why observations are absent from a dataset. The three categories, Missing Completely At Random (MCAR), Missing At Random (MAR), and Missing Not At Random (MNAR), describe how the probability of missingness relates to the observed and unobserved values. Judging which mechanism is plausible is essential because it determines which analytical strategies preserve valid and unbiased inference. | The Expectation-Maximization (EM) algorithm is an iterative optimization procedure for finding maximum likelihood or maximum a posteriori estimates of parameters in statistical models with latent variables or missing data. Introduced by Dempster, Laird, and Rubin in their landmark 1977 paper, EM alternates between computing the expected complete-data log-likelihood (E-step) and maximizing it with respect to the parameters (M-step), which guarantees that the likelihood never decreases from one iteration to the next. | Multiple Imputation (MI), formally introduced by Donald B. Rubin in 1987, is a principled statistical procedure for handling missing data. Rather than replacing each missing value once, MI fills the gaps m times, each time drawing plausible values from the posterior predictive distribution of the missing data, producing m complete datasets. Each dataset is analysed independently, and the results are combined into a single set of estimates using Rubin's pooling rules. The MICE variant (Multivariate Imputation by Chained Equations), popularised by van Buuren and Groothuis-Oudshoorn (2011), extends the approach to mixed variable types by imputing each variable in turn through a sequence of conditional regression models. |
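
To make the three missingness mechanisms from the summary row concrete, here is a minimal simulation sketch. The function name `make_missing`, the variables `x` and `y`, and the missingness probabilities (0.3, 0.6, 0.1) are illustrative assumptions, not part of Rubin's framework or of the table above.

```python
import random


def make_missing(x, y, mechanism, rng):
    """Return a copy of y with some entries replaced by None.

    x is fully observed; y is the variable that loses values.
    The probabilities below are arbitrary illustrative choices.
    """
    out = []
    for xi, yi in zip(x, y):
        if mechanism == "MCAR":
            p = 0.3                      # constant: ignores both x and y
        elif mechanism == "MAR":
            p = 0.6 if xi > 0 else 0.1   # depends only on the observed x
        elif mechanism == "MNAR":
            p = 0.6 if yi > 0 else 0.1   # depends on the value being hidden
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        out.append(None if rng.random() < p else yi)
    return out


def observed_mean(values):
    """Mean of the entries that were not removed."""
    kept = [v for v in values if v is not None]
    return sum(kept) / len(kept)


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(20000)]
y = [xi + rng.gauss(0, 1) for xi in x]   # y is correlated with x

print("full", round(sum(y) / len(y), 3))
for mech in ("MCAR", "MAR", "MNAR"):
    print(mech, round(observed_mean(make_missing(x, y, mech, rng)), 3))
```

In this setup the observed mean of `y` stays close to the full-sample mean under MCAR and shifts under both MAR and MNAR; only in the MAR case can the shift be explained from the observed `x`.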
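
The E-step and M-step alternation described in the EM summary can be sketched on a small latent-variable problem. The example below assumes a two-component one-dimensional Gaussian mixture, chosen only because it is compact; the function name `em_two_gaussians`, the initialization, and the synthetic data are assumptions for illustration and are not taken from the Dempster, Laird, and Rubin paper.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def em_two_gaussians(data, n_iter=50):
    """EM for a two-component 1D Gaussian mixture.

    Returns (weights, means, sigmas, log-likelihood trace).
    """
    # Crude initialization from the data range (an arbitrary choice).
    mu = [min(data), max(data)]
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]
    trace = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point,
        # computed under the current parameters.
        resp = []
        loglik = 0.0
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = p[0] + p[1]
            loglik += math.log(total)
            resp.append((p[0] / total, p[1] / total))
        trace.append(loglik)
        # M-step: responsibility-weighted maximum-likelihood updates.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = math.sqrt(max(var, 1e-6))  # guard against collapse
    return weight, mu, sigma, trace


# Synthetic data: the component label of each point is the latent variable.
rng = random.Random(1)
data = [rng.gauss(-2, 1) for _ in range(300)] + [rng.gauss(3, 1) for _ in range(200)]

weight, mu, sigma, trace = em_two_gaussians(data)
print("means:", [round(m, 2) for m in mu])
print("log-likelihood never decreased:",
      all(b >= a - 1e-6 for a, b in zip(trace, trace[1:])))
```

The log-likelihood trace gives a direct check of the monotonicity property mentioned in the summary: each recorded value should be no smaller than the one before it, up to floating-point tolerance.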
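
The pooling step mentioned in the multiple imputation summary can be sketched as follows. This is a minimal version of Rubin's rules for a single scalar estimate: the pooled estimate is the mean of the m per-dataset estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` and the input numbers are made up for illustration; the imputation and analysis steps themselves are not shown.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two estimates with matching variances")
    q_bar = statistics.fmean(estimates)      # pooled point estimate
    u_bar = statistics.fmean(variances)      # within-imputation variance
    b = statistics.variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b          # total variance
    return q_bar, total


# Hypothetical results from m = 3 imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var)
```

The between-imputation term is what separates this from simply averaging standard errors: it carries the extra uncertainty that comes from not knowing the missing values.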