ScholarGate

Data Replication and Consistency

Data replication keeps multiple copies of data for availability and performance, and consistency protocols govern how reads and writes across those copies are reconciled.

Definition

Data replication maintains copies of a data item on several nodes; a consistency model specifies the guarantees about the values that reads may return given the history of writes, ranging from strong (every read sees the latest write) to eventual (replicas converge if updates cease).

Scope

This topic covers replication strategies (primary-backup, multi-master, quorum), quorum-based read/write protocols and their intersection requirements, anti-entropy and gossip for eventual convergence, conflict detection with version vectors and conflict-free replicated data types (CRDTs), and the spectrum of consistency from linearizable to eventual. It treats the data-level counterpart of state-machine replication.

Core questions

  • How do quorum sizes for reads and writes guarantee that reads observe the latest write?
  • How do replicas converge under eventual consistency, and how are conflicts resolved?
  • What consistency level should an application choose given its latency and availability needs?

Key theories

Quorum consensus for replicated data
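Each replica is assigned a number of votes, and read and write quorums must hold vote totals that sum to more than the total number of votes. Every read quorum then shares at least one replica with the most recent write quorum, and by returning the copy with the highest version number among those consulted, a read observes up-to-date data. Gifford's scheme additionally requires a write quorum to hold more than half of the votes, so that two writes cannot proceed on disjoint sets of replicas.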
Eventual consistency and anti-entropy
Highly available stores accept writes at any replica and reconcile asynchronously via gossip and version vectors, guaranteeing only that replicas converge when updates stop, as exemplified by the Dynamo design.
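As an illustration of how version vectors detect conflicts during reconciliation, the following is a minimal sketch, not Dynamo's actual implementation. It assumes a version vector is a mapping from a node identifier to a counter, with missing entries treated as zero; the function names `compare` and `merge` are hypothetical.

```python
from __future__ import annotations


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Classify the causal relationship between two version vectors.

    Assumption: a missing node entry counts as 0.
    """
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a is an ancestor of b; b can overwrite a
    if b_le_a:
        return "after"       # b is an ancestor of a; a can overwrite b
    return "concurrent"      # neither dominates: a genuine conflict


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Element-wise maximum: the vector describing both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}
```

Two replicas that each accepted a write independently, for example `{"A": 1}` and `{"B": 1}`, compare as concurrent, so the store must keep both versions or apply an application-level resolution before recording the merged vector.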
Conflict-free replicated data types
CRDTs are data types whose operations are designed to commute or whose states form a join-semilattice, so concurrent updates merge deterministically without coordination, providing strong eventual consistency.
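A grow-only counter (G-Counter) is one of the simplest state-based CRDTs and can illustrate the join-semilattice idea. The sketch below is a minimal example under stated assumptions: each replica has a unique string identifier, state is exchanged by calling `merge` directly rather than over a network, and the class name `GCounter` is chosen here for illustration.

```python
from __future__ import annotations


class GCounter:
    """State-based grow-only counter: one slot per replica, merged by max."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        # A replica only ever increases its own slot.
        if amount < 0:
            raise ValueError("a G-Counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: GCounter) -> None:
        # Element-wise max is the semilattice join: it is commutative,
        # associative, and idempotent, so merge order does not matter.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)   # concurrent updates at two replicas
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)       # repeating a merge changes nothing
```

After the exchange both replicas hold the same state and report the same total, without either one having coordinated with the other before accepting its update.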

Practical relevance

These techniques define the guarantees of real storage systems: quorum protocols underlie strongly consistent key-value stores, while eventual consistency and CRDTs power highly available stores, shopping carts, and collaborative editors where availability outranks immediate agreement.

History

Gifford's 1979 weighted-voting scheme established quorum replication; Amazon's 2007 Dynamo paper popularized highly available eventual consistency; and the 2011 formalization of CRDTs gave a principled basis for coordination-free convergence, shaping modern replicated-data design.

Debates

How much consistency should replicated data provide by default?
Strong consistency eases application development but limits availability and adds latency, while eventual consistency maximizes availability at the cost of exposing temporary divergence; tunable quorums and CRDTs are attempts to let applications choose per operation.

Key figures

  • David Gifford
  • Werner Vogels
  • Marc Shapiro
  • Andrew S. Tanenbaum

Seminal works

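  • gifford1979: D. K. Gifford, "Weighted Voting for Replicated Data" (1979)
  • decandia2007: G. DeCandia et al., "Dynamo: Amazon's Highly Available Key-value Store" (2007)
  • shapiro2011: M. Shapiro et al., "Conflict-free Replicated Data Types" (2011)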

Frequently asked questions

How do read and write quorums guarantee fresh reads?
If a write must reach W replicas and a read must consult R replicas, and R + W exceeds the total number of replicas N, then any read quorum overlaps the most recent write quorum in at least one replica. Because each copy carries a version number, the read returns the value with the highest version among the replicas it consulted, which is the latest write.
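The overlap argument can be checked by brute force for small configurations. The sketch below is illustrative only: it assumes unweighted replicas numbered 0 to N-1, models each replica as a hypothetical `(version, value)` pair, and ignores failures and concurrent writes.

```python
from __future__ import annotations

from itertools import combinations


def quorums_intersect(n: int, r: int, w: int) -> bool:
    """True if every size-r read set shares a replica with every size-w write set."""
    replicas = range(n)
    return all(
        bool(set(rq) & set(wq))
        for rq in combinations(replicas, r)
        for wq in combinations(replicas, w)
    )


def quorum_read(replicas: list[tuple[int, str]], read_set: list[int]) -> str:
    """Return the value with the highest version among the consulted replicas."""
    version, value = max((replicas[i] for i in read_set), key=lambda p: p[0])
    return value


# N = 3 replicas; the write of version 2 reached replicas 0 and 1 (W = 2),
# while replica 2 still holds the stale version 1.
replicas = [(2, "new"), (2, "new"), (1, "old")]

fresh = quorum_read(replicas, [1, 2])   # R = 2: overlaps the write quorum
stale = quorum_read(replicas, [2])      # R = 1: R + W = N, overlap not guaranteed
```

With R = 2 and W = 2 on three replicas, every possible read set includes at least one replica holding version 2. With R = 1 the read may land on replica 2 alone and return the stale value.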