ScholarGate

Phylogenetic Inference

Phylogenetic inference is the set of methods used to reconstruct evolutionary trees from character data, turning patterns of similarity and difference into hypotheses about ancestry.

Definition

Phylogenetic inference is the estimation of evolutionary relationships among taxa from heritable characters, most often molecular sequences. It produces a tree, with branching order and sometimes branch lengths, that best explains the data under an explicit optimality criterion or probabilistic model.

Scope

This topic covers the principal tree-building methods (distance, parsimony, maximum likelihood, and Bayesian inference); the models of sequence evolution they assume; the use of the bootstrap and posterior probabilities to assess support; and pitfalls, such as long-branch attraction, that can mislead inference.

Core questions

  • How do distance, parsimony, likelihood, and Bayesian methods differ in inferring trees?
  • What models describe how DNA sequences change along branches?
  • How is confidence in a tree, such as bootstrap support or posterior probability, assessed?
  • What artifacts, like long-branch attraction, can cause incorrect trees?

Key theories

Optimality-based and model-based tree inference
Trees can be chosen by minimizing character changes (parsimony), fitting pairwise distances (distance methods), or maximizing the probability of the data under an explicit substitution model (likelihood and Bayesian methods).
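As an illustration of the parsimony criterion, the sketch below counts the minimum number of character changes a fixed tree requires, using Fitch's small-parsimony procedure for unordered characters. The function names (fitch_score, parsimony_length), the nested-tuple tree encoding, and the four-taxon alignment are assumptions made for this example, not part of any particular package.

```python
def fitch_score(tree, states):
    """Return (possible ancestral states, changes) for one character.

    tree   -- a leaf name (str) or a pair (left, right) of subtrees
    states -- dict mapping each leaf name to its state at this site
    """
    if isinstance(tree, str):                 # leaf: its observed state, no change
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_score(left, states)
    right_set, right_cost = fitch_score(right, states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change
        return shared, left_cost + right_cost
    # children disagree: take the union and count one change
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over every column of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_score(tree, column)[1]
    return total


# Hypothetical toy data: four taxa, four aligned sites.
alignment = {"A": "AAGT", "B": "AAGC", "C": "ACTC", "D": "ACTT"}
candidates = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}
lengths = {name: parsimony_length(t, alignment) for name, t in candidates.items()}
best = min(lengths, key=lengths.get)          # tree needing the fewest changes
```

With only four taxa all three unrooted topologies can be scored exhaustively; real analyses rely on heuristic tree search because the number of topologies grows super-exponentially with the number of taxa.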
Bootstrap assessment of support
Resampling characters with replacement and rebuilding trees estimates how strongly the data support each clade, providing a standard measure of confidence in inferred relationships.
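The resampling step can be sketched as follows. To keep the example short, the tree-building step is replaced by a toy stand-in, closest_pair, which only reports the two taxa with the fewest differences; a real analysis would rebuild a full tree for every replicate. All function names and the example alignment are assumptions for illustration.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in picks)
            for taxon, seq in alignment.items()}


def closest_pair(alignment):
    """Toy stand-in for a tree builder: the pair of taxa with fewest differences.

    Ties are broken by alphabetical order of the taxon names.
    """
    def n_diff(a, b):
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    pairs = combinations(sorted(alignment), 2)
    return frozenset(min(pairs, key=lambda p: n_diff(*p)))


def bootstrap_support(alignment, grouping, n_reps=200, seed=0):
    """Fraction of bootstrap replicates in which `grouping` is recovered."""
    rng = random.Random(seed)                 # fixed seed for reproducibility
    hits = 0
    for _ in range(n_reps):
        replicate = resample_columns(alignment, rng)
        if closest_pair(replicate) == grouping:
            hits += 1
    return hits / n_reps


# Hypothetical toy alignment.
alignment = {"A": "AAGTCA", "B": "AAGCCA", "C": "ACTCGA", "D": "ACTTGG"}
support_ab = bootstrap_support(alignment, frozenset({"A", "B"}))
```

The returned value is a proportion between 0 and 1; published bootstrap values are usually the same quantity expressed as a percentage.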

Mechanisms

Distance methods such as neighbor-joining convert sequence differences into a matrix of pairwise distances and build a tree by clustering, gaining speed at the cost of discarding site-by-site information. Parsimony selects the tree requiring the fewest character changes. Maximum likelihood and Bayesian methods adopt explicit models of substitution, which can account for unequal base frequencies, transition-transversion bias, and among-site rate variation, and search for the tree (and model parameters) that best explain the data. Support is assessed by the bootstrap in distance, parsimony, and likelihood analyses, or by posterior probabilities in Bayesian analysis. Long-branch attraction and model misspecification can produce confidently wrong trees, so method choice and model adequacy matter.
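The first step of a distance method, turning aligned sequences into a pairwise matrix, can be sketched as below. The example assumes the Jukes-Cantor (JC69) model, under which the observed proportion of differing sites p is corrected for multiple substitutions as d = -(3/4) ln(1 - 4p/3). JC69 is the simplest substitution model and ignores the base-frequency and transition-transversion effects mentioned above; the function names and the toy alignment are assumptions for illustration.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    pairs = [(a, b) for a, b in zip(seq1, seq2)
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(seq1, seq2):
    """Jukes-Cantor corrected distance in expected substitutions per site."""
    p = p_distance(seq1, seq2)
    if p >= 0.75:                 # saturated: the correction is undefined
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(alignment):
    """Symmetric dict-of-dicts of JC69 distances between all taxa."""
    taxa = sorted(alignment)
    return {a: {b: jc69_distance(alignment[a], alignment[b]) for b in taxa}
            for a in taxa}


# Hypothetical toy alignment.
alignment = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA"}
matrix = distance_matrix(alignment)
```

A matrix like this is the input to a clustering step such as neighbor-joining. The corrected distance is always at least as large as the raw proportion of differences, and the gap widens as sequences diverge.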

Clinical relevance

Phylogenetic inference reconstructs viral and bacterial transmission histories, identifies the source of outbreaks, and dates the emergence of resistant or virulent strains, making it a core tool of genomic epidemiology.

History

Cladistic and distance methods emerged in the 1960s and 1970s. Felsenstein introduced a practical maximum-likelihood method for DNA sequences in 1981 and the bootstrap for phylogenies in 1985, and Saitou and Nei introduced neighbor-joining in 1987. Bayesian inference and ever-larger genomic datasets have since become standard.

Debates

Parsimony versus model-based methods
A long-running methodological debate concerns whether parsimony or explicit probabilistic models give more reliable trees, especially when rates of change are uneven and long-branch attraction is a risk.

Key figures

  • Joseph Felsenstein
  • Masatoshi Nei
  • Naruya Saitou
  • Willi Hennig

Seminal works

  • saitouNei1987
  • felsenstein1985
  • felsensteinBook2004

Frequently asked questions

Which method gives the correct evolutionary tree?
No method is guaranteed to be correct. Model-based methods such as maximum likelihood and Bayesian inference are generally favored for sequence data, but every method can be misled by uneven evolutionary rates and model violations, so results should be judged by both their support values and the adequacy of the model used.
What does a bootstrap value mean?
A bootstrap value is the proportion of resampled datasets in which a particular grouping recurs when the tree is rebuilt. High values indicate that the grouping is consistently supported by the characters analyzed; they measure repeatability and are not the probability that the grouping is true.
