Hierarchical Cluster Analysis
Hierarchical cluster analysis builds a nested sequence of clusters, visualized as a dendrogram, by successively merging or splitting groups according to a linkage criterion.
Definition
Hierarchical cluster analysis is a clustering approach that produces a tree of nested partitions by iteratively combining the most similar clusters, or splitting the least cohesive ones, according to a chosen between-cluster distance.
Scope
This topic covers agglomerative (bottom-up) and divisive (top-down) hierarchical clustering, the common linkage rules such as single, complete, average, and Ward's minimum-variance linkage, the construction and interpretation of the dendrogram, and the cutting of the tree to obtain a flat partition.
Core questions
- How can a nested family of clusterings be constructed from pairwise dissimilarities?
- How do different linkage rules shape the resulting clusters?
- How is the dendrogram read and where should it be cut?
- When is a hierarchical structure more informative than a single flat partition?
Key theories
- Linkage-defined merging
- Agglomerative clustering repeatedly merges the two clusters that are closest under a linkage definition; single, complete, average, and Ward linkages encode different notions of between-cluster distance and produce characteristically different cluster shapes.
- Dendrogram representation
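- The sequence of merges is encoded as a dendrogram in which the height of each merge records the between-cluster dissimilarity at which the two clusters were joined; cutting the tree at a chosen height yields a flat partition, so any number of clusters can be read off a single tree.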
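The merging procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `linkage_distance` and `agglomerate`, the tuple format of the merge list, and the 1-D example points are all hypothetical choices made here. Ward's linkage is omitted for brevity, and the naive search is far slower than what statistical libraries provide.

```python
from itertools import combinations


def linkage_distance(a, b, dist, rule):
    """Between-cluster distance for clusters a and b (lists of point indices)."""
    pair_dists = [dist[i][j] for i in a for j in b]
    if rule == "single":
        return min(pair_dists)       # closest pair across the two clusters
    if rule == "complete":
        return max(pair_dists)       # farthest pair across the two clusters
    if rule == "average":
        return sum(pair_dists) / len(pair_dists)
    raise ValueError(f"unknown linkage rule: {rule}")


def agglomerate(dist, rule="single"):
    """Return the merge sequence as (members_a, members_b, height) tuples.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Naive illustration only: every pair of clusters is rescanned at each step.
    """
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of current clusters that is closest under the rule.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage_distance(clusters[p[0]], clusters[p[1]], dist, rule),
        )
        height = linkage_distance(clusters[a], clusters[b], dist, rule)
        merges.append((sorted(clusters[a]), sorted(clusters[b]), height))
        merged = clusters[a] + clusters[b]
        # Delete b first (b > a) so that index a stays valid.
        del clusters[b]
        del clusters[a]
        clusters.append(merged)
    return merges


# Hypothetical 1-D example: five points on a line.
points = [0.0, 1.0, 2.5, 6.0, 7.0]
dist = [[abs(p - q) for q in points] for p in points]
single_merges = agglomerate(dist, "single")
complete_merges = agglomerate(dist, "complete")
print([h for _, _, h in single_merges])    # merge heights under single linkage
print([h for _, _, h in complete_merges])  # merge heights under complete linkage
```

On this small example both rules merge the same groups in the same order, but the recorded merge heights differ, which is what changes the shape of the dendrogram and the effect of cutting it at a given height.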
Clinical relevance
Hierarchical clustering is widely used where a nested grouping is natural or informative, such as constructing taxonomies, organizing gene-expression heatmaps, and exploring document or organism similarity.
History
Hierarchical grouping methods were formalized in the late 1950s and early 1960s, including Ward's minimum-variance criterion (1963), and became staples of numerical taxonomy and exploratory data analysis as computing made dendrogram construction routine.
Debates
- Choice of linkage
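- Single linkage is prone to chaining, in which clusters are joined through a string of close intermediate points and become elongated; complete linkage tends to produce compact groups; and Ward's method favors compact, roughly spherical clusters of similar size. The linkage choice therefore strongly shapes the results, and no single rule is correct for all data.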
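The differences between the linkage rules are easier to see when they are written out. The notation below is assumed here for illustration rather than taken from the source: A and B are clusters, d(a, b) is the pairwise dissimilarity, and μ_A and μ_B are the cluster centroids. Ward's criterion is stated in its usual form as the increase in within-cluster sum of squares under Euclidean distance.

```latex
\begin{align*}
d_{\text{single}}(A,B)   &= \min_{a \in A,\, b \in B} d(a,b) \\
d_{\text{complete}}(A,B) &= \max_{a \in A,\, b \in B} d(a,b) \\
d_{\text{average}}(A,B)  &= \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a,b) \\
\Delta_{\text{Ward}}(A,B) &= \frac{|A|\,|B|}{|A|+|B|} \,\lVert \mu_A - \mu_B \rVert^{2}
\end{align*}
```

The minimum in single linkage lets one close pair of points join two otherwise distant clusters, while the maximum in complete linkage penalizes any distant pair. The size factor in Ward's criterion grows with the sizes of the clusters being merged, which helps explain its tendency toward groups of similar size.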
Key figures
- Joe Ward
- Peter Rousseeuw
Related topics
Seminal works
- everitt2011
- kaufman1990
- wardjr1963
Frequently asked questions
- What is the difference between agglomerative and divisive clustering?
- Agglomerative clustering starts with each object as its own cluster and merges upward, while divisive clustering starts with one cluster and splits downward; agglomerative methods are far more common in practice.
- How do I choose the number of clusters from a dendrogram?
- Cut the tree at a chosen height, often just below a sharp jump in merge heights. Such a jump indicates that the next merge would combine groups that are much less similar to each other than the groups merged below it.
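The "largest jump" heuristic for cutting a dendrogram can be sketched as follows. This is one possible heuristic under stated assumptions, not a definitive rule: the function name `cut_at_largest_gap`, the merge-list format of `(members_a, members_b, height)` tuples, and the example merges are hypothetical and chosen here only for illustration.

```python
def cut_at_largest_gap(n_points, merges):
    """Cut a merge sequence at the largest jump between consecutive heights.

    merges is a list of (members_a, members_b, height) tuples in merge order,
    with non-decreasing heights (as produced by single, complete, or average
    linkage). Returns one cluster label per point.
    """
    heights = [h for _, _, h in merges]
    # gaps[i] is the jump in height from merge i to merge i + 1.
    gaps = [later - earlier for earlier, later in zip(heights, heights[1:])]
    # Keep every merge up to and including the one just below the largest jump.
    keep = gaps.index(max(gaps)) + 1 if gaps else len(merges)

    labels = list(range(n_points))  # each point starts in its own cluster
    for members_a, members_b, _ in merges[:keep]:
        target = labels[members_a[0]]
        for i in members_b:
            labels[i] = target      # relabel cluster b to join cluster a
    return labels


# Hypothetical merge sequence for five points (indices 0..4).
example_merges = [
    ([0], [1], 1.0),
    ([3], [4], 1.0),
    ([2], [0, 1], 1.5),
    ([3, 4], [0, 1, 2], 3.5),
]
labels = cut_at_largest_gap(5, example_merges)
print(labels)
```

In this example the last merge sits well above the others, so the cut falls below it and the points separate into two groups. When merge heights rise gradually, with no clear jump, this heuristic gives little guidance and the choice of cut is better guided by domain knowledge.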