Compare methods
Review the selected methods side by side; rows that differ are highlighted (marked ≠).
| | Gaussian mixture model | Principal component analysis | UMAP |
|---|---|---|---|
| Domain | Machine learning | Machine learning | Machine learning |
| Family | Machine learning | Machine learning | Machine learning |
| Year of origin ≠ | 1977 | 2002 | 2018 |
| Original author ≠ | Dempster, Laird & Rubin (EM algorithm) | Jolliffe, I.T. (textbook); Pearson & Hotelling (origins) | McInnes, L.; Healy, J.; Melville, J. |
| Type ≠ | Probabilistic (soft) clustering; mixture model | Unsupervised dimensionality reduction | Nonlinear manifold-learning dimension reduction |
| Foundational source ≠ | Dempster, A.P., Laird, N.M. & Rubin, D.B. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society: Series B, 39(1), 1–22. | Jolliffe, I.T. (2002). Principal Component Analysis (2nd ed.). Springer. | McInnes, L., Healy, J. & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv:1802.03426. |
| Aliases ≠ | Gaussian Karışım Modeli (GMM Kümeleme), GMM, GMM clustering, mixture of Gaussians | Temel Bileşenler Analizi (PCA), PCA, principal components analysis, Karhunen-Loève transform | UMAP (Uniform Manifold Approximation and Projection), uniform manifold approximation and projection, manifold dimension reduction |
| Related methods ≠ | 4 | 3 | 5 |
| Summary ≠ | A Gaussian Mixture Model is a probabilistic clustering method that models the data as a weighted mixture of several Gaussian distributions, fitted with the Expectation–Maximization algorithm formalized by Dempster, Laird & Rubin in 1977. It generalizes K-means: each cluster can take its own shape, size, and orientation. | Principal Component Analysis (PCA) is an unsupervised dimensionality-reduction method, given its modern textbook treatment by Ian Jolliffe (2002), that compresses high-dimensional data into fewer dimensions while preserving as much variance as possible. It re-expresses correlated variables as a small set of uncorrelated principal components ordered by how much of the data's variation each one captures. | UMAP (Uniform Manifold Approximation and Projection) is a fast, scalable nonlinear dimension-reduction method grounded in manifold-learning theory, introduced by McInnes, Healy and Melville in 2018. It compresses high-dimensional data into a low-dimensional embedding for visualisation and downstream analysis. |
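
The Gaussian mixture summary in the table describes data modelled as a weighted mixture of Gaussians fitted with Expectation–Maximization. The sketch below illustrates that idea on one-dimensional synthetic data using only the Python standard library. It is a minimal illustration under stated assumptions, not a reference implementation: the function names `gaussian_pdf` and `fit_gmm_1d`, the two-component setup, the min/max initialisation of the means, and the fixed iteration count are all hypothetical choices made for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances, responsibilities), where the
    responsibilities come from the final E-step.
    """
    k = 2
    n = len(data)
    # Assumed initialisation: means at the data extremes, shared variance.
    means = [min(data), max(data)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    resp = []

    for _ in range(n_iter):
        # E-step: soft assignment of each point to each component.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])

        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # guard against variance collapse

    return weights, means, variances, resp


# Synthetic data: two well-separated clusters of unequal size.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
data += [random.gauss(10.0, 1.0) for _ in range(300)]

weights, means, variances, resp = fit_gmm_1d(data)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

Each row of `resp` holds the probabilities that a point belongs to each component, which is the "soft" part of soft clustering; a hard K-means-style label would instead keep only the larger entry.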