What is the difference between global and local alignment?

Global alignment (Needleman-Wunsch) aligns two sequences end to end and is appropriate when they are of similar length and expected to be related throughout. Local alignment (Smith-Waterman) finds the best-matching subregion and is appropriate when only parts of the sequences are similar, such as a conserved domain within larger sequences.

Is sequence alignment the same as the longest common subsequence?

They are closely related. The longest common subsequence is a special case of alignment with particular scoring (matches rewarded, gaps free, mismatches disallowed). General alignment uses richer scoring with substitution costs and gap penalties, but both are solved by the same kind of dynamic-programming recurrence.

Sequence Alignment Algorithms

Sequence alignment algorithms measure and display the similarity between two strings by finding the lowest-cost way to transform one into the other, using dynamic programming over insertions, deletions and substitutions.

Atrast tematu ar PaperMindDrīzumāFind papers & topics

Tools & resources

Lejupielādēt slaidus

Learn & explore

VideoDrīzumā

Definition

Sequence alignment is the arrangement of two (or more) strings to maximize their correspondence by inserting gaps and matching, mismatching, or substituting characters; an alignment algorithm computes an optimal alignment and its score under a given cost or scoring model, typically via dynamic programming.

Scope

This topic covers the dynamic-programming algorithms for comparing strings: edit (Levenshtein) distance and the longest common subsequence, global alignment (Needleman-Wunsch), local alignment (Smith-Waterman), and the scoring schemes (gap penalties, substitution matrices) that shape biological alignments. It treats the underlying recurrence and its quadratic time and space cost, plus space-saving refinements. Index-based and exact-matching methods are covered in related topics.

Core questions

How is similarity between two strings quantified as an edit distance or alignment score?
How does a dynamic-programming table compute an optimal alignment?
What distinguishes global alignment from local alignment, and when is each appropriate?
How do gap penalties and substitution matrices influence the resulting alignment?
How can the quadratic space requirement be reduced for long sequences?

Key concepts

edit (Levenshtein) distance
longest common subsequence
global alignment
local alignment
Needleman-Wunsch algorithm
Smith-Waterman algorithm
gap penalties
substitution matrices

Key theories

Global alignment by dynamic programming: The Needleman-Wunsch algorithm fills a dynamic-programming table whose entries give the best alignment score of string prefixes, yielding an optimal end-to-end (global) alignment of two sequences in time proportional to the product of their lengths.
Local alignment for shared subsequences: The Smith-Waterman algorithm modifies the recurrence to allow alignments to start and end anywhere, finding the highest-scoring matching region (local alignment), which is essential for detecting conserved fragments within otherwise dissimilar sequences.

Clinical relevance

Sequence alignment is a cornerstone of computational biology, used to compare DNA, RNA and protein sequences for homology, evolutionary inference, and database search (the heuristic BLAST builds on local alignment). Outside biology, the same edit-distance machinery powers spell checking, fuzzy text matching, version-control diffs, and speech and handwriting comparison.

History

Levenshtein introduced edit distance in 1965. Needleman and Wunsch gave the first dynamic-programming global-alignment algorithm for proteins in 1970, and Smith and Waterman introduced local alignment in 1981. Hirschberg's linear-space technique (1975) and later heuristics such as BLAST extended alignment to large-scale practical use.

Key figures

Saul Needleman
Christian Wunsch
Temple Smith
Michael Waterman
Vladimir Levenshtein

Seminal works

needleman1970
smith1981
gusfield1997

Frequently asked questions

What is the difference between global and local alignment?: Global alignment (Needleman-Wunsch) aligns two sequences end to end and is appropriate when they are of similar length and expected to be related throughout. Local alignment (Smith-Waterman) finds the best-matching subregion and is appropriate when only parts of the sequences are similar, such as a conserved domain within larger sequences.
Is sequence alignment the same as the longest common subsequence?: They are closely related. The longest common subsequence is a special case of alignment with particular scoring (matches rewarded, gaps free, mismatches disallowed). General alignment uses richer scoring with substitution costs and gap penalties, but both are solved by the same kind of dynamic-programming recurrence.