ScholarGate

Population Stratification and Admixture

Population stratification is the presence of systematic differences in genetic ancestry between the compared groups of a genetic study, and admixture is the mixing of ancestries within individuals from previously separated populations. Both create population structure that can confound genetic association studies, generating spurious links between a variant and a disease simply because allele frequency and disease risk both differ by ancestry.

Definition

Population stratification is confounding in a genetic association study caused by ancestry differences between groups, in which allele frequencies and disease risk both vary across subpopulations; admixture is the presence within individuals of genetic ancestry from two or more historically distinct populations, a related source of structure.

Scope

The topic covers how population structure arises, why it confounds case-control genetic association studies, and the main methods used to detect and adjust for it. It is presented as a methodological subject in genetic epidemiology, concerned with study validity, and not as a statement about the biology or ranking of human population groups.

Core questions

  • Are the compared groups in a genetic study drawn from the same underlying population?
  • Could an apparent variant-disease association be explained by ancestry rather than causation?
  • How can population structure be detected from genetic data?
  • How can association tests be adjusted so that structure does not inflate false positives?

Key concepts

  • Confounding by ancestry
  • Population structure and substructure
  • Admixture
  • Allele frequency differences
  • Genomic control
  • Principal components analysis of ancestry
  • Mixed models for relatedness

Mechanisms

If the cases and controls in an association study differ in ancestry, any variant whose frequency differs between those ancestral groups will appear associated with disease whenever disease risk also differs between the groups, even when the variant has no causal role. This is classical confounding, with genetic ancestry as the confounder. Methods address it by measuring and adjusting for ancestry. Genomic control divides association test statistics by an inflation factor estimated from many markers across the genome, typically the ratio of the observed median statistic to the median expected under the null hypothesis. Principal components analysis summarises ancestry from genome-wide genotypes, and the leading components are included as covariates in the association test. Mixed models account for both broad structure and cryptic relatedness, usually through a genome-wide relatedness matrix. Admixture, where individuals carry mixed ancestry, can be handled with related approaches that estimate global (genome-wide average) or local (segment-by-segment) ancestry.
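The arithmetic behind this confounding can be sketched with expected allele counts. The example below is a minimal illustration and is not drawn from any study: the two subpopulations, their sizes, allele frequencies and disease risks are hypothetical numbers, and the names `Subpopulation`, `expected_allele_counts` and `odds_ratio` are invented for this sketch. Within each subpopulation the allele is assumed to be independent of disease, so any pooled association comes from ancestry alone.

```python
from dataclasses import dataclass


@dataclass
class Subpopulation:
    size: int            # number of individuals (hypothetical)
    allele_freq: float   # frequency of the tested allele (hypothetical)
    disease_risk: float  # probability of being a case (hypothetical)


def expected_allele_counts(pop):
    """Expected allele counts (case_alt, case_ref, control_alt, control_ref).

    Assumes the allele is independent of disease within the subpopulation,
    i.e. the variant has no causal role.
    """
    cases = pop.size * pop.disease_risk
    controls = pop.size - cases
    p = pop.allele_freq
    # Each individual carries two alleles.
    return (2 * cases * p, 2 * cases * (1 - p),
            2 * controls * p, 2 * controls * (1 - p))


def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Allelic odds ratio from a 2x2 table of allele counts."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Two ancestral groups that differ in BOTH allele frequency and disease risk.
populations = [
    Subpopulation(size=10_000, allele_freq=0.2, disease_risk=0.05),
    Subpopulation(size=10_000, allele_freq=0.6, disease_risk=0.15),
]

per_pop_counts = [expected_allele_counts(pop) for pop in populations]

# Within each group the odds ratio is 1: no association.
within_or = [odds_ratio(*counts) for counts in per_pop_counts]

# Pooling the groups, as an unadjusted case-control analysis would,
# produces an apparent association.
pooled_counts = [sum(cell) for cell in zip(*per_pop_counts)]
pooled_or = odds_ratio(*pooled_counts)

print("within-group odds ratios:", [f"{w:.2f}" for w in within_or])
print("pooled odds ratio:", f"{pooled_or:.2f}")
```

With these assumed numbers both within-group odds ratios are 1, while the pooled odds ratio is about 1.57, even though the variant has no effect in either group.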

Clinical relevance

Controlling for population structure is essential to the validity of the genetic association evidence that informs understanding of chronic disease risk, since uncontrolled stratification can produce false associations that mislead later research. As a reference topic this entry explains a threat to study validity and how it is addressed; it does not provide guidance for individual genetic testing or interpretation.

Epidemiology

Concern about stratification grew as genetic association studies scaled up, because even modest ancestry differences between cases and controls can inflate false-positive rates across the many variants tested in a genome-wide study. The development of genomic control and, subsequently, principal-components and mixed-model adjustment made large multi-ancestry association studies feasible while keeping false-positive rates controlled.
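How an inflation factor is estimated and applied can be sketched as follows. This is a simplified illustration under stated assumptions: the test statistics are simulated null chi-square values (1 degree of freedom) multiplied by a hypothetical constant, `SIMULATED_INFLATION`; the function names are invented for this sketch; and the floor of 1 on the factor is a common convention assumed here.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Ratio of the observed median statistic to its expected null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide every statistic by the inflation factor (never by less than 1)."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [stat / lam for stat in chi2_stats], lam


# Simulate many null markers whose statistics are uniformly inflated,
# which is the situation genomic control is designed for.
SIMULATED_INFLATION = 1.3  # hypothetical value chosen for illustration
rng = random.Random(0)
observed = [SIMULATED_INFLATION * rng.gauss(0.0, 1.0) ** 2 for _ in range(20_000)]

corrected, lam = genomic_control(observed)

print(f"estimated inflation factor: {lam:.3f}")
print(f"inflation factor after correction: {inflation_factor(corrected):.3f}")
```

The estimate should land close to the simulated value, and the corrected statistics have a median that matches the null expectation. Because a single factor is applied to every marker, this approach assumes inflation is roughly uniform across the genome.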

History

Awareness that ancestry could confound association studies predates the genomic era, but practical solutions emerged in the late 1990s and 2000s. In 1999 Pritchard and Rosenberg proposed using unlinked markers to detect stratification, and Devlin and Roeder introduced genomic control to correct inflated test statistics. In 2006 Price and colleagues showed that principal components analysis could efficiently correct for stratification in genome-wide association studies, an approach that became standard practice.

Debates

How completely can statistical adjustment remove confounding by ancestry?
Genomic control, principal components, and mixed models reduce inflation from population structure, but debate continues over residual confounding from fine-scale or recent structure and over how well these corrections transfer across diverse and admixed populations.

Key figures

  • Jonathan Pritchard
  • Noah Rosenberg
  • Bernie Devlin
  • Kathryn Roeder
  • Alkes Price
  • David Reich

Seminal works

  • Pritchard and Rosenberg (1999)
  • Devlin and Roeder (1999)
  • Price and colleagues (2006)

Frequently asked questions

Why does population stratification cause false associations?
When cases and controls differ in ancestry, variants that simply differ in frequency between ancestral groups can look associated with disease whenever disease risk also differs by ancestry, so the association reflects confounding by ancestry rather than a causal effect of the variant.
How do modern studies correct for population structure?
Common approaches estimate ancestry from genome-wide data and adjust for it, for example by including principal components of ancestry as covariates, applying genomic control to rescale test statistics, or using mixed models that account for relatedness and structure.
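The principal-components approach can be sketched on simulated data. Everything in this example is an assumption made for illustration: the two subpopulations, the exaggerated allele frequency differences, the quantitative phenotype (used instead of case-control status to keep the regression simple), and all function names. It uses power iteration to find the first component and then regresses that component out of both genotype and phenotype, which is a simplified stand-in for including it as a covariate.

```python
import random


def simulate(n_per_pop, n_markers, rng):
    """Genotypes (0/1/2), ancestry labels and a phenotype for two subpopulations.

    The phenotype mean differs by subpopulation, but no marker is causal.
    """
    freqs_a = [rng.uniform(0.1, 0.9) for _ in range(n_markers)]
    freqs_b = [min(0.95, max(0.05, p + rng.uniform(-0.3, 0.3))) for p in freqs_a]
    # Marker 0 is the test marker: strongly differentiated, still non-causal.
    freqs_a[0], freqs_b[0] = 0.1, 0.9
    genotypes, labels, phenotype = [], [], []
    for label, freqs in ((0, freqs_a), (1, freqs_b)):
        for _ in range(n_per_pop):
            genotypes.append([(rng.random() < p) + (rng.random() < p) for p in freqs])
            labels.append(label)
            phenotype.append(2.0 * label + rng.gauss(0.0, 1.0))
    return genotypes, labels, phenotype


def standardized_columns(genotypes):
    """Centre each marker and scale it to unit variance; returns marker columns."""
    n = len(genotypes)
    columns = []
    for column in zip(*genotypes):
        mu = sum(column) / n
        sd = (sum((g - mu) ** 2 for g in column) / n) ** 0.5 or 1.0
        columns.append([(g - mu) / sd for g in column])
    return columns


def first_principal_component(columns, rng, iterations=30):
    """Score of each individual on the first component, by power iteration."""
    rows = list(zip(*columns))
    scores = [rng.gauss(0.0, 1.0) for _ in rows]  # random starting vector
    for _ in range(iterations):
        loadings = [sum(g * s for g, s in zip(col, scores)) for col in columns]
        scores = [sum(g * w for g, w in zip(row, loadings)) for row in rows]
        norm = sum(s * s for s in scores) ** 0.5
        scores = [s / norm for s in scores]
    return scores


def correlation(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residualize(values, covariate):
    """Residuals from a simple linear regression of values on one covariate."""
    mv, mc = sum(values) / len(values), sum(covariate) / len(covariate)
    slope = (sum((c - mc) * (v - mv) for c, v in zip(covariate, values))
             / sum((c - mc) ** 2 for c in covariate))
    return [(v - mv) - slope * (c - mc) for c, v in zip(covariate, values)]


rng = random.Random(1)
genotypes, labels, phenotype = simulate(n_per_pop=50, n_markers=400, rng=rng)
pc1 = first_principal_component(standardized_columns(genotypes), rng)

test_marker = [row[0] for row in genotypes]
ancestry_fit = abs(correlation(pc1, labels))
unadjusted = correlation(test_marker, phenotype)
adjusted = correlation(residualize(test_marker, pc1), residualize(phenotype, pc1))

print(f"|correlation| of PC1 with true ancestry: {ancestry_fit:.2f}")
print(f"marker-phenotype correlation, unadjusted: {unadjusted:.2f}")
print(f"marker-phenotype correlation, adjusted for PC1: {adjusted:.2f}")
```

In this toy setting the first component tracks the simulated ancestry labels closely, and adjusting for it removes most of the spurious marker-phenotype correlation. Real analyses typically use several components and dedicated software, and some residual confounding can remain because the component is only an estimate of ancestry.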
