Rare Variant Discovery and Burden Testing

Standard GWAS are powered to detect common variants, but much of the genome's functional variation is rare. Because any single rare variant appears in too few people to test reliably one at a time, rare-variant analysis instead aggregates variants - typically within a gene - and tests whether their combined burden differs between cases and controls. Sequencing made these variants observable, and methods such as burden and kernel tests made them statistically tractable.
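
The power problem can be illustrated with a small calculation. The sketch below is not taken from any cited method: it assumes a balanced study of 1000 cases and 1000 controls, treats every carrier as a distinct individual, and uses the conventional genome-wide threshold of 5e-8. The function name `best_case_p` is hypothetical. It computes the most favourable one-sided exact p-value a single variant could achieve, namely when every carrier happens to be a case.

```python
from math import comb

def best_case_p(carriers: int, n_cases: int, n_controls: int) -> float:
    """One-sided exact p-value when every carrier happens to be a case.

    Under the null, carriers are a random draw from all participants, so the
    chance that all of them are cases is a hypergeometric probability.
    Assumes each carrier is a distinct individual.
    """
    total = n_cases + n_controls
    return comb(n_cases, carriers) / comb(total, carriers)

GENOME_WIDE_ALPHA = 5e-8  # conventional single-marker threshold (assumption)

for k in (3, 5, 10, 20, 30):
    p = best_case_p(k, n_cases=1000, n_controls=1000)
    verdict = "can reach" if p < GENOME_WIDE_ALPHA else "cannot reach"
    print(f"{k:>2} carriers: best-case p = {p:.2e} ({verdict} {GENOME_WIDE_ALPHA:.0e})")
```

With three carriers the best case is a p-value of roughly 0.12, so no amount of enrichment could make that variant significant on its own in this design.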

Definition

Rare-variant discovery is the identification, usually by sequencing, of rare and low-frequency genetic variants associated with a trait. Burden testing is a family of gene- or region-based methods that aggregate multiple rare variants into a single test, gaining power that single-marker analysis lacks.
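
As a minimal sketch of what aggregation means in practice, the example below collapses a toy genotype matrix into one score per individual. The data, the function name `burden_scores`, and the 1% minor-allele-frequency cutoff are illustrative assumptions; real analyses choose the cutoff and the variant set from annotation and study design.

```python
def burden_scores(genotypes, mafs, maf_cutoff=0.01):
    """Collapse rare variants in one gene into per-individual scores.

    genotypes: rows are individuals, columns are variants (0/1/2 minor alleles).
    mafs: minor allele frequency of each variant.
    Returns (rare allele counts, carrier indicators).
    """
    # Keep only variants below the frequency cutoff (assumed: 1%).
    rare = [j for j, maf in enumerate(mafs) if maf < maf_cutoff]
    counts = [sum(row[j] for j in rare) for row in genotypes]
    indicators = [1 if c > 0 else 0 for c in counts]
    return counts, indicators

# Toy data: 4 individuals, 4 variants; the last variant is common and is excluded.
genotypes = [
    [0, 1, 0, 2],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]
mafs = [0.002, 0.004, 0.001, 0.30]

counts, indicators = burden_scores(genotypes, mafs)
print(counts)      # [1, 0, 2, 0]
print(indicators)  # [1, 0, 1, 0]
```

Either vector can then be tested against the phenotype like a single predictor, which is what turns many uninformative variants into one testable quantity.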

Scope

This topic covers why rare variants escape conventional single-marker GWAS, the sequencing technologies and reference panels that reveal them, and the main aggregation strategies - simple burden (collapsing) tests, variance-component kernel tests such as SKAT, and combined or optimal tests such as SKAT-O. It also notes the role of variant annotation in deciding which variants to aggregate. It is a methods reference, not clinical guidance.

Core questions

Why does conventional single-marker GWAS lack power for rare variants?
How does sequencing, rather than array genotyping, reveal rare variation?
How do burden (collapsing) tests aggregate rare variants within a gene?
How do kernel-based tests such as SKAT differ from simple burden tests?
When are variants assumed to act in the same direction, and what happens when they do not?

Key concepts

Rare and low-frequency variants
Whole-exome and whole-genome sequencing
Gene- or region-based aggregation
Burden / collapsing tests
Sequence Kernel Association Test (SKAT)
Combined and optimal tests (SKAT-O)
Functional annotation and variant weighting

Mechanisms

Single-marker association testing loses power when a variant is carried by only a handful of individuals, so rare-variant methods aggregate variants across a gene or region. Burden (collapsing) tests summarise the rare variants in a unit into a single count or carrier indicator per individual and test whether that burden differs between cases and controls; they are powerful when most variants affect the trait in the same direction but lose power when effects are mixed in direction or many variants are neutral. Variance-component kernel tests, exemplified by the Sequence Kernel Association Test (SKAT), instead treat the variant effects as random and test whether their variance is greater than zero, without assuming a common direction, so they retain power when effects are heterogeneous. Combined approaches such as SKAT-O search over a family of statistics that ranges from the pure kernel test to the pure burden test, and so perform well across scenarios. Because the result depends on which variants are included, functional annotation and frequency-based weighting are used to focus on plausibly deleterious variants. Sequencing and diverse reference panels such as the 1000 Genomes Project underpin the discovery and annotation of the rare variation these tests analyse.
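
The contrast between the two families can be sketched on toy data. The example below computes unstandardised score-type statistics only: it assumes a binary phenotype, no covariates, and Beta(1, 25) frequency weights (a common default for SKAT, used here as an assumption), and it omits the null distributions needed to obtain p-values. All function names and data are hypothetical.

```python
def beta_1_25_weight(maf: float) -> float:
    """Beta(1, 25) density at the MAF; rarer variants get larger weights."""
    return 25.0 * (1.0 - maf) ** 24

def variant_scores(genotypes, phenotype):
    """Per-variant score S_j = sum_i g_ij * (y_i - mean(y)); no covariates."""
    mean_y = sum(phenotype) / len(phenotype)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * (y - mean_y) for row, y in zip(genotypes, phenotype))
        for j in range(n_variants)
    ]

def burden_statistic(scores, weights):
    """Sum the weighted scores first, then square: opposite signs cancel."""
    return sum(w * s for w, s in zip(weights, scores)) ** 2

def skat_statistic(scores, weights):
    """Square each weighted score first, then sum: signs cannot cancel."""
    return sum((w * s) ** 2 for w, s in zip(weights, scores))

# Toy gene: variant 0 is seen only in cases, variant 1 only in controls.
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
genotypes = [
    [1, 0], [1, 0], [0, 0], [0, 0],
    [0, 1], [0, 1], [0, 0], [0, 0],
]
mafs = [0.001, 0.001]  # assumed population frequencies

scores = variant_scores(genotypes, phenotype)
weights = [beta_1_25_weight(m) for m in mafs]
q_burden = burden_statistic(scores, weights)
q_skat = skat_statistic(scores, weights)
print("scores:", scores)
print("burden statistic:", q_burden)
print("SKAT-type statistic:", q_skat)
```

In this constructed case one variant looks risk-increasing and the other protective, so the burden statistic is exactly zero while the kernel-type statistic stays positive. If both variants were enriched in cases, the burden statistic would be the larger of the two.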

Clinical relevance

Rare-variant methods extend genetic discovery toward variation more likely to be functional and closer to underlying biology, complementing common-variant GWAS. This topic describes analytic methods and is not a basis for individual variant interpretation, diagnosis, or treatment decisions.

Evidence & guidelines

The methodological basis comes from statistical-genetics literature rather than clinical guidelines. Wu et al. (2011) introduced SKAT for sequencing data; Lee et al. (2012) developed the optimal combined test (SKAT-O); the 1000 Genomes Project (2015) provided reference data for rare variation; and Manolio et al. (2009) framed rare variants as one candidate source of heritability not captured by common-variant GWAS.

History

As common-variant GWAS matured and left heritability unexplained, attention turned to rare variants that arrays could not capture. The spread of affordable exome and genome sequencing around 2010 made rare variation observable at scale, and a wave of aggregation methods followed: simple collapsing tests, then variance-component kernel tests such as SKAT in 2011, and adaptive combinations such as SKAT-O in 2012. Large sequencing consortia and biobank exome studies have since applied these methods broadly, though detecting rare-variant signals still demands very large samples.

Debates

When should burden tests be preferred over kernel tests? Burden tests are most powerful when aggregated variants act in a consistent direction, while kernel tests such as SKAT are more robust to mixed effect directions and many neutral variants; combined tests aim to hedge, but the right choice depends on the unknown true architecture of the gene.
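
The hedge made by combined tests can be written compactly. The block below is a sketch in commonly used notation rather than a full specification: $g_{ij}$ is the genotype of individual $i$ at variant $j$, $y_i$ the phenotype, $\hat{\mu}_i$ the phenotype mean fitted under the null model, $w_j$ a variant weight, and $\rho$ a mixing parameter.

```latex
\begin{align}
S_j &= \sum_i g_{ij}\,(y_i - \hat{\mu}_i) \\
Q_{\text{burden}} &= \Big(\sum_j w_j S_j\Big)^{2} \\
Q_{\text{SKAT}} &= \sum_j w_j^{2} S_j^{2} \\
Q_{\rho} &= (1-\rho)\,Q_{\text{SKAT}} + \rho\,Q_{\text{burden}}, \qquad 0 \le \rho \le 1
\end{align}
```

Setting $\rho = 0$ recovers the kernel test and $\rho = 1$ the burden test. SKAT-O evaluates $Q_{\rho}$ over a grid of $\rho$ values, takes the smallest p-value, and corrects for having searched, which is why it rarely wins outright but seldom does badly under either architecture.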

Key figures

Xihong Lin
Michael Wu
Seunggeun Lee
Michael Boehnke
Teri Manolio

Seminal works

Wu et al. (2011)
Lee et al. (2012)
Manolio et al. (2009)

Frequently asked questions

Why can't a GWAS test rare variants one at a time? A variant carried by only a few individuals provides too little statistical information for a reliable single-marker test, so rare-variant methods aggregate many variants - usually within a gene - to gain power.
How does SKAT differ from a simple burden test? A burden test assumes the aggregated variants act mostly in the same direction, whereas SKAT is a variance-component test that detects departures from the null even when variant effects differ in direction or magnitude, making it more robust to heterogeneous effects.