Compare methods

View the selected methods side by side; rows that differ are marked with ≠.

	Elbow Method	Calinski-Harabasz Index	Gap Statistic
Field	Model evaluation	Model evaluation	Model evaluation
Family	MCDM	MCDM	MCDM
Year of origin≠	1953	1974	2001
Originator≠	Robert Thorndike	Tadeusz Calinski, Jerzy Harabasz	Robert Tibshirani, Guenther Walther, Trevor Hastie
Type≠	Heuristic optimization criterion	Cluster quality metric	Statistical criterion
Original source≠	Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics.	Calinski, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics, 3(1), 1-27.	Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411-423.
Aliases≠	elbow analysis, knee detection	variance ratio criterion, pseudo F-statistic, CH index	gap index, Tibshirani gap statistic
Related	5	5	5
Summary≠	The Elbow Method is a heuristic for choosing the number of clusters in partitional clustering. Introduced by Robert Thorndike in 1953, it fits clustering models for an increasing number of clusters and plots the within-cluster sum of squares (WCSS) against that number. The 'elbow' is the point where the decrease in WCSS slows markedly, which suggests a suitable cluster count.	The Calinski-Harabasz Index, also called the Variance Ratio Criterion, was introduced by Calinski and Harabasz in 1974. It measures the ratio of between-cluster variance to within-cluster variance, adjusted for the number of clusters and the number of data points. Higher values indicate better-separated, more compact clusters.	The Gap Statistic, developed by Tibshirani, Walther, and Hastie in 2001, is a statistical method for estimating the number of clusters in a dataset. It compares the logarithm of the observed within-cluster dispersion with its expected value under a null reference distribution that has no cluster structure, which gives cluster number selection a theoretical grounding.
ScholarGate dataset	v1, 2 sources, published	v1, 1 source, published	v1, 1 source, published
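
The three summaries above can be made concrete with a small sketch. The following is a minimal, standard-library-only illustration, not a reference implementation: the helper names (`kmeans`, `wcss`, `calinski_harabasz`, `gap_statistic`), the farthest-first initialisation, the synthetic three-blob data, and the uniform bounding-box reference distribution are all assumptions made for this example. The standard-error rule that Tibshirani et al. use to pick the final cluster count is omitted; only the gap values are computed.

```python
import math
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-first initialisation (assumed here)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = mean_point(members)
    return labels, centers

def wcss(points, labels, centers):
    """Within-cluster sum of squares: the quantity plotted in the elbow method."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))

def calinski_harabasz(points, labels, centers):
    """Between-cluster over within-cluster variance, scaled by (k - 1) and (n - k)."""
    n, k = len(points), len(centers)
    overall = mean_point(points)
    between = sum(labels.count(j) * sq_dist(centers[j], overall) for j in range(k))
    return (between / (k - 1)) / (wcss(points, labels, centers) / (n - k))

def gap_statistic(points, k, n_refs=10, seed=0):
    """Mean log-WCSS of uniform reference data minus log-WCSS of the observed data."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    ref_logs = []
    for _ in range(n_refs):
        ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs)) for _ in points]
        ref_logs.append(math.log(wcss(ref, *kmeans(ref, k))))
    return sum(ref_logs) / n_refs - math.log(wcss(points, *kmeans(points, k)))

# Synthetic data (illustrative only): three well-separated 2-D blobs of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in blob_centers for _ in range(30)]

ks = range(1, 7)
wcss_curve = {k: wcss(data, *kmeans(data, k)) for k in ks}          # inspect for an elbow
ch_scores = {k: calinski_harabasz(data, *kmeans(data, k)) for k in ks if k > 1}
gaps = {k: gap_statistic(data, k) for k in ks}
best_ch_k = max(ch_scores, key=ch_scores.get)                        # higher CH is better
```

On data like this, the WCSS curve drops steeply up to three clusters and flattens afterwards, the Calinski-Harabasz score peaks at three, and the gap value at three is well above the value at one. Real data are rarely this clean, so the three criteria can disagree.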
