Haplotype Blocks and Population-Level Structural Organization
A haplotype is a set of variants that sit together on a single chromosome and tend to be inherited as a unit. Across the genome these variants are not arranged at random: stretches of strong correlation, called haplotype blocks, alternate with shorter regions where recombination has shuffled the combinations. This block-like structure is the population-level organization of genetic variation, and it underpins how variants are tagged, imputed, and mapped to traits.
Definition
A haplotype is a combination of alleles at linked loci inherited together on one chromosome; haplotype blocks are genomic segments within which a few common haplotypes account for most chromosomes, reflecting strong linkage disequilibrium, and they are bounded by sites of historical recombination.
Scope
This topic covers haplotypes and linkage disequilibrium, the empirical observation that the genome is organized into blocks of limited haplotype diversity separated by recombination-rich boundaries, and how this structure differs among populations. It treats haplotype structure as a population-genomic concept relevant to mapping and reference resources; it does not provide clinical or ancestry-interpretation guidance for individuals.
Core questions
- What is a haplotype, and what is linkage disequilibrium?
- Why is genetic variation organized into blocks of limited haplotype diversity?
- How do recombination hotspots define block boundaries?
- How does haplotype structure differ among human populations, and why does it matter for mapping?
Key concepts
- Haplotype
- Linkage disequilibrium
- Haplotype block
- Tag SNP
- Recombination hotspot
- Genotype imputation
- Population-specific haplotype structure
Key theories
- Haplotype-block structure of the genome
- The genome is organized into segments of strong linkage disequilibrium within which a small number of common haplotypes predominate, separated by short regions of historical recombination, so that a few tag variants can capture most of the common variation in a block.
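The claim that a few tag variants can capture most common variation in a block can be illustrated with a small sketch. The Python below is a simple greedy heuristic, not the procedure of any particular study or tool. It assumes phased haplotypes coded as 0/1 alleles and uses r² as the correlation measure; the function names `r_squared` and `greedy_tags`, the 0.8 threshold, and the toy haplotypes are illustrative assumptions.

```python
def r_squared(x, y):
    """Squared correlation between two biallelic sites (0/1 alleles per haplotype)."""
    n = len(x)
    p_x = sum(x) / n
    p_y = sum(y) / n
    p_xy = sum(a & b for a, b in zip(x, y)) / n  # frequency of the 1-1 haplotype
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denom == 0:  # a monomorphic site carries no correlation signal
        return 0.0
    return (p_xy - p_x * p_y) ** 2 / denom


def greedy_tags(haplotypes, threshold=0.8):
    """Pick tag sites so that every site has r^2 >= threshold with some tag.

    Assumed heuristic: repeatedly choose the site that captures the most
    still-untagged sites (ties go to the lowest site index).
    """
    n_sites = len(haplotypes[0])
    columns = [[h[j] for h in haplotypes] for j in range(n_sites)]
    # covers[i] = sites captured by site i, always including i itself
    covers = {
        i: {j for j in range(n_sites)
            if i == j or r_squared(columns[i], columns[j]) >= threshold}
        for i in range(n_sites)
    }
    untagged = set(range(n_sites))
    tags = []
    while untagged:
        best = max(range(n_sites), key=lambda i: len(covers[i] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Hypothetical block: 10 chromosomes, 5 sites, only 3 distinct haplotypes.
toy_block = ([(0, 0, 0, 0, 0)] * 4
             + [(1, 1, 0, 1, 0)] * 3
             + [(0, 0, 1, 0, 1)] * 3)
print(greedy_tags(toy_block))  # two tag sites capture all five sites
```

In this toy block, sites 0, 1, and 3 are perfectly correlated with each other, as are sites 2 and 4, so one tag from each group is enough. Real blocks are noisier, and the number of tags needed depends on the threshold chosen.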
Mechanisms
Linkage disequilibrium, the non-random association of alleles at nearby sites, arises because variants on the same chromosome are co-inherited until recombination separates them. Recombination is concentrated at hotspots, so segments between hotspots accumulate few historical crossovers and retain a small set of common haplotypes, producing the block structure observed empirically. Within a block, a handful of tag variants can stand in for the rest. Genotype imputation exploits the same correlation: untyped variants are inferred by matching the typed variants to haplotypes in a reference panel. Because recombination history, drift, and demography differ among populations, block boundaries and haplotype frequencies are population-specific.
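Linkage disequilibrium between two biallelic sites is commonly summarized by three statistics: D (the difference between the observed haplotype frequency and the frequency expected under independence), D' (D scaled by its maximum possible value given the allele frequencies), and r² (the squared correlation). The following is a minimal sketch of these standard definitions, assuming phased haplotypes coded 0/1; the function name `ld_stats` and the example data are hypothetical.

```python
from typing import Dict, Sequence


def ld_stats(site_a: Sequence[int], site_b: Sequence[int]) -> Dict[str, float]:
    """Return D, D' and r^2 for two biallelic sites.

    Each argument lists the allele (0 or 1) carried at that site by each
    phased haplotype, in the same haplotype order.
    """
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("sites must cover the same non-empty set of haplotypes")

    p_a = sum(site_a) / n  # frequency of allele 1 at site A
    p_b = sum(site_b) / n  # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")

    d = p_ab - p_a * p_b  # departure from independence
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    return {"D": d, "D_prime": abs(d) / d_max, "r2": d * d / denom}


# Hypothetical data: only three of the four possible two-site haplotypes occur.
stats = ld_stats([1, 1, 0, 0], [1, 0, 0, 0])
print(stats)
```

In the example, D' equals 1 because one of the four possible haplotypes is absent (no evidence of recombination between the sites), while r² is well below 1 because the allele frequencies differ. The two measures answer different questions, which is one reason block definitions built on them can disagree.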
Clinical relevance
Haplotype structure is what makes association mapping and imputation feasible: because a typed tag variant is correlated with nearby untyped variants, research linking genetic variation to traits can survey common variation without genotyping every site. This entry describes haplotype organization as a population-genomic reference concept and is not a basis for individual diagnosis, treatment, or ancestry interpretation.
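How a typed variant can stand in for untyped neighbours can be shown with a deliberately simplified sketch. It assumes a phased target haplotype, a small fully typed reference panel, and exact matching at the typed sites followed by a majority vote. Production imputation tools use probabilistic haplotype models rather than exact matching, so this toy only illustrates the principle. The function name `impute_haplotype` and the panel are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple

Haplotype = Tuple[int, ...]


def impute_haplotype(
    observed: Tuple[Optional[int], ...],
    reference_panel: List[Haplotype],
) -> Optional[Haplotype]:
    """Fill untyped sites (None) using reference haplotypes that match the typed sites.

    Returns None when no reference haplotype agrees with every typed site.
    """
    typed = [i for i, allele in enumerate(observed) if allele is not None]
    matches = [
        ref for ref in reference_panel
        if all(ref[i] == observed[i] for i in typed)
    ]
    if not matches:
        return None

    imputed = []
    for i, allele in enumerate(observed):
        if allele is not None:
            imputed.append(allele)  # keep the typed allele as observed
        else:
            # majority allele among the matching reference haplotypes
            imputed.append(Counter(ref[i] for ref in matches).most_common(1)[0][0])
    return tuple(imputed)


# Hypothetical reference panel: one block, three common haplotypes.
panel = ([(0, 0, 0, 0, 0)] * 4
         + [(1, 1, 0, 1, 0)] * 3
         + [(0, 0, 1, 0, 1)] * 3)

# Only sites 0 and 2 were typed; the other three are inferred from the panel.
print(impute_haplotype((1, None, 0, None, None), panel))
```

Because the block contains only three haplotypes, two typed sites identify which one the target carries, and the remaining alleles follow. Where the panel does not represent the target population's haplotypes well, this inference degrades.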
Epidemiology
Empirical surveys, beginning with Gabriel and colleagues, showed that much of the genome falls into haplotype blocks within which a few common haplotypes cover most chromosomes. Block sizes and boundaries vary by population, and blocks are generally shorter in populations of African ancestry, consistent with a deeper population history over which recombination has had more time to break down associations between variants. The International HapMap and 1000 Genomes resources catalogued this structure genome-wide across multiple ancestral groups, providing reference panels for tagging and imputation.
History
The idea that variation is correlated along the chromosome predates genomics, but its genome-wide structure became measurable only with dense variant maps. Gabriel and colleagues described haplotype blocks in 2002, the International HapMap Project then catalogued haplotype structure across populations, and the 1000 Genomes Project extended reference haplotypes to whole-genome sequence, together establishing the resources that make tagging and imputation routine.
Debates
- How discrete are haplotype blocks?
- Blocks are a useful description, but linkage disequilibrium decays continuously and block definitions depend on the metric and threshold used, so whether the genome is truly partitioned into discrete blocks or shows a continuum of correlation remains a matter of interpretation.
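The dependence of block boundaries on the chosen threshold can be made concrete with a toy partition rule. The sketch below breaks a run of sites wherever linkage disequilibrium between neighbouring sites falls below a cutoff. This adjacent-pair rule is a simplification chosen for illustration, not one of the published block definitions, and the function name `partition_blocks` and the LD values are hypothetical.

```python
from typing import List, Sequence


def partition_blocks(adjacent_ld: Sequence[float], threshold: float) -> List[List[int]]:
    """Group sites 0..n into blocks, splitting where neighbouring-site LD < threshold.

    adjacent_ld[i] is the LD value (for example r^2) between site i and site i + 1.
    """
    blocks = [[0]]
    for i, ld in enumerate(adjacent_ld):
        if ld >= threshold:
            blocks[-1].append(i + 1)  # site i + 1 joins the current block
        else:
            blocks.append([i + 1])    # LD too weak: start a new block
    return blocks


# Hypothetical LD between consecutive pairs of seven sites.
adjacent_ld = [0.95, 0.90, 0.60, 0.85, 0.30, 0.92]

print(partition_blocks(adjacent_ld, threshold=0.8))  # [[0, 1, 2], [3, 4], [5, 6]]
print(partition_blocks(adjacent_ld, threshold=0.5))  # [[0, 1, 2, 3, 4], [5, 6]]
```

The same data yield three blocks at one cutoff and two at another. Only the sharp drop between sites 4 and 5 is a boundary under both, which is the sense in which strong recombination boundaries are robust while weaker ones depend on the definition.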
Key figures
- Stacey B. Gabriel
- David Altshuler
- Montgomery Slatkin
- Eric S. Lander
- Kelly A. Frazer
Related topics
Seminal works
- gabriel-2002
- hapmap-2007
- slatkin-2008
Frequently asked questions
- What is the difference between a haplotype and a genotype?
- A genotype lists the alleles a person carries at a site without specifying which chromosome each is on, whereas a haplotype specifies the combination of alleles that travel together on a single chromosome.
- Why does haplotype structure differ between populations?
- Populations differ in their recombination history, genetic drift, and demography, so the size of haplotype blocks and the frequencies of particular haplotypes vary; blocks are generally shorter in populations with deeper ancestral histories, such as those of African ancestry.
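The genotype-versus-haplotype distinction in the first question above can be shown by enumeration: an unphased genotype that is heterozygous at more than one site is compatible with several haplotype pairs. The sketch below assumes genotypes coded as alternate-allele counts (0, 1, or 2) per site; the function name `compatible_haplotype_pairs` is hypothetical.

```python
from itertools import product
from typing import FrozenSet, Sequence, Set, Tuple

Haplotype = Tuple[int, ...]


def compatible_haplotype_pairs(genotype: Sequence[int]) -> Set[FrozenSet[Haplotype]]:
    """List every unordered pair of haplotypes consistent with an unphased genotype.

    genotype[i] is the number of alternate alleles (0, 1 or 2) carried at site i.
    """
    if any(g not in (0, 1, 2) for g in genotype):
        raise ValueError("genotype values must be 0, 1 or 2")

    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    pairs = set()
    for assignment in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are fixed: 0 -> allele 0 on both, 2 -> allele 1 on both.
        h1 = [g // 2 for g in genotype]
        h2 = list(h1)
        # Heterozygous sites: one chromosome gets each allele, in either arrangement.
        for site, allele in zip(het_sites, assignment):
            h1[site] = allele
            h2[site] = 1 - allele
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


# A person heterozygous at both of two sites.
for pair in compatible_haplotype_pairs((1, 1)):
    print(sorted(pair))
```

For the double heterozygote, the genotype alone cannot distinguish the pair (0, 0) with (1, 1) from the pair (0, 1) with (1, 0). Resolving that ambiguity is what phasing does, and knowing which haplotypes are common in a population is what makes it tractable.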