Why does tolerating f Byzantine faults require 3f+1 processes?

Correct processes must reach a majority decision even when f faulty processes lie and another f may be slow or unreachable; only with more than three times the number of faulty processes can the correct majority always outvote the combination of faulty and lagging participants.

Byzantine Fault Tolerance

Byzantine fault tolerance is the study of reaching agreement when some processes may fail arbitrarily, including by sending conflicting or malicious messages.

Najít téma v PaperMindJiž brzyFind papers & topics

Tools & resources

Stáhnout prezentaci

Learn & explore

VideoJiž brzy

Definition

A Byzantine fault is an arbitrary deviation from a component's specification, including malicious behavior; Byzantine fault tolerance is the ability of a distributed protocol to satisfy its correctness conditions despite a bounded number of such faulty participants.

Scope

This topic covers the Byzantine generals problem and the tight resilience bounds it establishes (at least 3f+1 processes, or signed messages, to tolerate f arbitrary faults), the synchronous oral- and signed-message agreement algorithms, and practical asynchronous Byzantine fault-tolerant protocols such as PBFT and their descendants. It underlies the trust models of permissioned blockchains and high-integrity replicated services.

Core questions

How many correct processes are needed to tolerate a given number of arbitrary faults?
How do digital signatures change the resilience bounds for Byzantine agreement?
How can Byzantine agreement be made efficient enough for practical asynchronous systems?

Key theories

Byzantine generals bound: With unauthenticated (oral) messages, Byzantine agreement is solvable if and only if more than two-thirds of the processes are correct (n > 3f); authenticated messages with unforgeable signatures relax this requirement.
Practical BFT replication: PBFT shows that Byzantine agreement can run in a partially synchronous system with state-machine replication using a primary and a three-phase protocol, achieving practical performance while tolerating up to one-third faulty replicas.
Synchronous agreement algorithms: In the synchronous model, recursive oral-message and signed-message algorithms achieve agreement in f+1 rounds, illustrating the round-complexity cost of tolerating arbitrary faults.

Clinical relevance

Byzantine fault tolerance is the foundation of systems that must withstand not just crashes but compromise or malicious behavior, including permissioned blockchains, high-assurance replicated services, and avionics and aerospace control where arbitrary component failure cannot be ruled out.

History

Pease, Shostak, and Lamport established the agreement bounds in 1980 and dramatized them as the Byzantine generals problem in 1982; for nearly two decades Byzantine agreement was considered too costly for practice until Castro and Liskov's 1999 PBFT demonstrated efficient asynchronous Byzantine replication, reviving the field and later informing blockchain consensus.

Key figures

Leslie Lamport
Robert Shostak
Marshall Pease
Miguel Castro
Barbara Liskov

Seminal works

lamport1982byz
castro1999
pease1980

Frequently asked questions

Why does tolerating f Byzantine faults require 3f+1 processes?: Correct processes must reach a majority decision even when f faulty processes lie and another f may be slow or unreachable; only with more than three times the number of faulty processes can the correct majority always outvote the combination of faulty and lagging participants.