ScholarGate
עוזר

Distributed Commit and Consensus

Distributed commit and consensus protocols let multiple database nodes agree atomically on the outcome of a transaction and on a consistent order of operations, even when nodes and networks can fail.

מציאת נושא עם PaperMindבקרובFind papers & topics
Tools & resources
הורדת מצגת
Learn & explore
וידאובקרוב

Definition

Atomic commitment is the problem of ensuring that all participating nodes in a distributed transaction agree to either commit or abort it; consensus is the more general problem of having distributed nodes agree on a single value or ordered log despite failures, used to keep database replicas consistent.

Scope

This topic covers atomic commitment of transactions that span sites — the two-phase commit protocol, its blocking weakness, and the non-blocking three-phase commit — and the consensus protocols (Paxos, Raft) used to keep replicated state machines and commit logs consistent. It treats these mechanisms in their database role of guaranteeing atomicity and replica agreement. It defers the general distributed-systems theory of consensus, failure models, and impossibility results to distributed-computing topics, citing rather than duplicating them.

Core questions

  • How does two-phase commit achieve atomic commitment across nodes?
  • Why can two-phase commit block, and how does three-phase commit address it?
  • How do consensus protocols such as Paxos and Raft keep replicas in agreement?
  • How does consensus relate to atomic commitment in replicated databases?
  • How do failures and network partitions affect the guarantees these protocols provide?

Key concepts

  • atomic commitment
  • two-phase commit (2PC)
  • coordinator and participants
  • blocking and the coordinator-failure problem
  • three-phase commit (3PC)
  • Paxos
  • Raft
  • replicated log / state machine replication

Key theories

Two-phase commit
A coordinator asks all participants to prepare (vote) and then, only if all vote yes, tells them to commit; the protocol guarantees atomicity but blocks if the coordinator fails while participants are uncertain.
Non-blocking commitment
Three-phase commit adds a pre-commit phase so that participants can reach a consistent decision even if the coordinator fails, removing the blocking behavior of two-phase commit under certain failure assumptions.
Consensus for replicated state
Paxos and the more understandable Raft let a set of replicas agree on an ordered log of operations despite crashes, providing the fault-tolerant agreement that modern databases use to replicate commit decisions and configuration.

Clinical relevance

Distributed commit and consensus are what make multi-node databases reliable: they ensure that a transaction touching several shards either commits everywhere or nowhere, and that replicated systems do not diverge after failures, which is essential for correctness in financial and globally distributed data systems.

History

Two-phase commit was the early standard for distributed atomicity; Skeen and Stonebraker analyzed its blocking behavior in 1983 and motivated three-phase commit. Lamport's Paxos (published 1998) provided fault-tolerant consensus, and Raft (Ongaro and Ousterhout, 2014) offered a more understandable alternative; both now underpin replication in widely used distributed databases.

Debates

Atomic commitment versus consensus for replicated transactions
Two-phase commit guarantees atomicity but blocks on coordinator failure, while consensus-based replication tolerates failures at higher cost; system designers debate how to combine commit and consensus to get both atomicity and availability across replicated, partitioned data.

Key figures

  • Leslie Lamport
  • Dale Skeen
  • Diego Ongaro
  • John Ousterhout

Related topics

Seminal works

  • lamport1998
  • skeen1983
  • ongaro2014

Frequently asked questions

Why does two-phase commit block, and is that a problem?
If the coordinator fails after participants have voted to commit but before they learn the decision, those participants are stuck holding locks and cannot safely commit or abort on their own. This blocking ties up resources until the coordinator recovers, which is why three-phase commit and consensus-based approaches were developed to reduce or remove it.
How is consensus different from atomic commitment?
Atomic commitment requires unanimous agreement to commit — any single no-vote or failure can force an abort. Consensus requires only a majority to agree on a value and can make progress despite a minority of failed nodes. Databases use atomic commitment for cross-shard transactions and consensus for keeping replicas of each shard consistent.

Methods for this concept

Related concepts