Evolutionary Conservation and Constraint Metrics

Sequence that has changed little across species, or that carries fewer variants than expected within a species, is said to be conserved or constrained — a signature that purifying selection has removed deleterious changes. Conservation and constraint metrics turn this evolutionary signal into quantitative scores that flag functionally important positions in the genome.

Definition

Evolutionary conservation is the persistence of a sequence across species through purifying selection; constraint is the corresponding depletion of variation within a species. Conservation and constraint metrics are scores that quantify, position by position or gene by gene, how strongly selection has acted against change.

Scope

This entry covers cross-species conservation scores and within-species constraint metrics, the evolutionary logic that connects them to function, and their role in prioritising variants. It is a methodological topic describing how scores are derived and interpreted, not a tool for assigning clinical significance to a particular variant.

Core questions

Why does conservation across species indicate biological function?
How are per-base conservation scores computed from multiple alignments?
How is constraint measured from the deficit of variation within a species?
How are these metrics used to prioritise candidate variants?

Key concepts

Purifying (negative) selection
Cross-species conservation scores
phastCons and phyloP
Within-species constraint
Loss-of-function intolerance
Observed vs expected variant counts

Key theories

Purifying selection and reduced substitution rate: Functionally important sites experience purifying (negative) selection that removes deleterious mutations, lowering their substitution rate between species and depleting segregating variation within a species; conservation and constraint scores read this reduced rate or depletion as evidence of function.
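
As a rough illustration of how a reduced substitution rate shows up in aligned sequences, the sketch below scores each alignment column by the fraction of non-gap residues that match the most common residue. The function name column_identity and the toy alignment are hypothetical. This identity fraction is only a crude proxy: unlike phastCons or phyloP, it ignores the phylogeny and any neutral substitution model.

```python
from collections import Counter


def column_identity(alignment):
    """Per-column fraction of non-gap residues equal to the most common residue.

    `alignment` is a list of equal-length aligned sequences (hypothetical input
    format). Columns made up entirely of gaps get a score of None.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):
        residues = [c.upper() for c in column if c != "-"]
        if not residues:
            scores.append(None)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


# Toy alignment of four hypothetical species; "-" marks an alignment gap.
toy_alignment = [
    "ACGTAC",
    "ACGTTC",
    "ACGAGC",
    "ACG-CC",
]

scores = column_identity(toy_alignment)
print(scores)
```

Columns where every species carries the same base score 1.0, while the fifth column, where all four species differ, scores 0.25. A phylogeny-aware method would additionally weigh how much change the tree's branch lengths lead one to expect at a neutral site.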

Mechanisms

Conservation is inferred by comparing homologous sequences across species in a multiple alignment and asking where substitutions are rarer than a neutral rate would predict. Methods such as phastCons and phyloP formalise this comparison on a phylogeny: phastCons uses a phylogenetic hidden Markov model to estimate the probability that each base lies in a conserved element, while phyloP tests each site for departure from the neutral substitution rate. Constraint instead uses within-species data, comparing the number of variants actually observed in a gene or region with the number expected under a neutral mutational model. A substantial deficit indicates intolerance to variation, especially to loss-of-function changes. Large human sequencing datasets have made it possible to quantify this constraint genome-wide.
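
The observed-versus-expected comparison can be sketched with simple arithmetic. The example below assumes the expected count is already available from some mutational model and treats the observed count as Poisson-distributed around it. The function names and the numbers are hypothetical. Published gene-level metrics rely on calibrated mutation-rate models and more elaborate statistics, so this is an illustration of the logic, not a reimplementation of any of them.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term.

    Adequate for modest expected counts; for very large `lam`, exp(-lam)
    underflows and a log-space or library implementation would be needed.
    """
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        total += term
    return min(total, 1.0)


def constraint_summary(observed, expected):
    """Return the observed/expected ratio and the probability of a deficit
    at least this large under the (assumed) neutral Poisson expectation."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return {
        "oe": observed / expected,
        "p_deficit": poisson_cdf(observed, expected),
    }


# Hypothetical gene: 5 loss-of-function variants observed, 50 expected.
summary = constraint_summary(observed=5, expected=50.0)
print(summary)
```

An observed/expected ratio well below 1, together with a very small deficit probability, is the pattern read as intolerance to variation. A ratio near 1 indicates that variation is about as common as the neutral model predicts.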

Clinical relevance

Conservation and constraint metrics are widely used as supporting evidence when prioritising candidate variants in research and diagnostic genomics, because variants at constrained positions or in constrained genes are, on average, more likely to be functionally consequential. These scores are population- and evolution-level summaries; this entry describes how they are derived and is not a basis for individual diagnostic or treatment decisions.

Evidence & guidelines

Cross-species conservation scoring was established by phylogenetic methods such as phastCons (siepel-2005) and phyloP. Large-scale human sequencing efforts, notably ExAC (lek-2016) and gnomAD (karczewski-2020), then quantified within-species constraint and loss-of-function intolerance across genes, providing the empirical basis for the constraint metrics now used in variant prioritisation.

History

Comparative sequence analysis long suggested that conserved regions were functional, and the availability of many aligned genomes in the 2000s allowed quantitative conservation scores to be computed genome-wide. With very large human cohorts in the 2010s, attention extended to within-species constraint, yielding gene-level intolerance metrics and a genome-wide map of mutational constraint.

Key figures

Adam Siepel
David Haussler
Katherine Pollard
Daniel MacArthur

Seminal works

siepel-2005
lek-2016
karczewski-2020

Frequently asked questions

What is the difference between conservation and constraint? Conservation is measured across species, from how little a sequence has changed over evolutionary time; constraint is measured within a species, from how much variation is missing relative to neutral expectation. Both reflect purifying selection but use different data.
Does a high constraint score mean a variant is pathogenic? No. Constraint and conservation are statistical signals of functional importance averaged over positions or genes; they can support variant prioritisation but do not by themselves establish that any individual variant causes disease.