ScholarGate

Cache Coherence Protocols

Cache coherence protocols keep the multiple private caches in a shared-memory multiprocessor consistent, ensuring that when one core writes a memory location, other cores do not continue to read stale cached copies.

Definition

A cache coherence protocol is a mechanism that maintains a consistent view of shared memory across the private caches of multiple processors. It tracks the state of each cached block and coordinates reads and writes so that every write eventually becomes visible to all processors and stale copies are never read.

Scope

This topic covers the coherence problem and its solutions: invalidation versus update protocols, snooping protocols on a shared bus, directory-based protocols for scalable systems, and the canonical state machines such as MSI and MESI. It treats how hardware preserves a coherent view of memory across caches. It excludes the ordering of operations across addresses (memory consistency, under shared-memory-and-coherence) and single-cache policies (cache organization and policies).

Core questions

  • What does it mean for caches to be coherent, and why is coherence needed?
  • How do invalidation and update protocols differ?
  • How do snooping protocols use a shared interconnect to maintain coherence?
  • How do directory protocols scale coherence beyond a shared bus?

Key concepts

  • coherence problem
  • invalidation vs update protocols
  • snooping protocols
  • directory-based protocols
  • MSI and MESI states
  • shared bus and interconnect
  • false sharing
  • write serialization

Key theories

State-based coherence (MSI/MESI)
Each cached block is tracked in one of a small set of states (such as Modified, Exclusive, Shared, Invalid). Reads and writes trigger state transitions and coherence messages that enforce the single-writer, multiple-reader invariant: at any time a block has either one writer or any number of readers, so no reader sees stale data.
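The state transitions can be sketched as a small table-free state machine. This is a minimal, simplified model of MESI for a single block, not a description of any particular processor: the function names (`on_local_read`, `on_remote_write`, `simulate`) are hypothetical, write-backs and data transfers are not modeled, and all caches are assumed to observe every operation.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_local_read(state, others_have_copy):
    """State after this cache's own core reads the block."""
    if state is State.I:
        # Read miss: Exclusive only if no other cache holds the block.
        return State.S if others_have_copy else State.E
    return state  # M, E, S: read hit, no transition


def on_local_write(state):
    """State after this cache's own core writes the block.
    From S or I the other copies must be invalidated first (see on_remote_write)."""
    return State.M


def on_remote_read(state):
    """State after observing another cache's read of the block."""
    if state in (State.M, State.E):
        return State.S  # a Modified copy would also supply the dirty data
    return state


def on_remote_write(state):
    """State after observing another cache's write to the block."""
    return State.I


def simulate(ops, n_caches=2):
    """Apply (cache_id, 'read' | 'write') operations to one block."""
    states = [State.I] * n_caches
    for cache, op in ops:
        others = any(s is not State.I
                     for i, s in enumerate(states) if i != cache)
        for i in range(n_caches):
            if i == cache:
                states[i] = (on_local_read(states[i], others)
                             if op == "read" else on_local_write(states[i]))
            else:
                states[i] = (on_remote_read(states[i])
                             if op == "read" else on_remote_write(states[i]))
    return states
```

For example, `simulate([(0, "read"), (1, "read"), (0, "write")])` leaves cache 0 in Modified and cache 1 in Invalid: the first read is Exclusive, the second read makes both copies Shared, and the write invalidates the other copy.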

Mechanisms

Each cache block carries a coherence state. In snooping protocols every cache observes (snoops) bus transactions and updates or invalidates its copies accordingly; in directory protocols a directory records which caches hold each block and sends targeted coherence messages. In an invalidation protocol, a write to a shared block first invalidates all other copies, giving the writer exclusive ownership before it modifies the data; an update protocol instead propagates the new value to the other copies.
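The directory mechanism with write-invalidate can be illustrated with a toy model. This is a sketch under stated assumptions, not a real protocol: the `Directory` class, its method names, and the message labels `"invalidate"` and `"downgrade"` are hypothetical, and acknowledgements, data transfer, and races between requests are ignored.

```python
class Directory:
    """Toy directory: tracks, per block, which caches hold a copy."""

    def __init__(self):
        self.sharers = {}  # block address -> set of cache ids holding a copy
        self.owner = {}    # block address -> cache id with exclusive ownership

    def read(self, cache_id, block):
        """Record a read; return the targeted messages the directory sends."""
        owner = self.owner.get(block)
        if owner == cache_id:
            return []  # the owner already holds an up-to-date copy
        msgs = []
        if owner is not None:
            # The exclusive owner must give up write permission.
            msgs.append(("downgrade", owner))
            del self.owner[block]
        self.sharers.setdefault(block, set()).add(cache_id)
        return msgs

    def write(self, cache_id, block):
        """Record a write; invalidate every other copy before granting ownership."""
        if self.owner.get(block) == cache_id:
            return []  # already exclusive, no messages needed
        holders = self.sharers.get(block, set())
        msgs = [("invalidate", c) for c in sorted(holders - {cache_id})]
        self.sharers[block] = {cache_id}
        self.owner[block] = cache_id
        return msgs
```

If caches 0, 1, and 2 all read block `0x40` and cache 1 then writes it, `write(1, 0x40)` returns invalidations for caches 0 and 2 only. A snooping protocol would instead broadcast the write on the bus and let every cache decide for itself whether it holds a copy.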

Practical relevance

Coherence makes shared-memory multiprocessing programmable: software can treat memory as a single shared store while hardware hides the existence of multiple caches. Coherence traffic and effects such as false sharing, however, can dominate performance in parallel programs, so awareness of coherence is important for scalable multithreaded software.

History

Coherence protocols emerged with bus-based multiprocessors in the 1980s, where snooping schemes such as write-invalidate and the MESI family became standard. As systems scaled past a single bus, directory-based protocols were developed for distributed shared memory, and coherence remains a central design problem in multicore processors.

Debates

Snooping versus directory coherence
Snooping is simple and fast on small shared-bus systems but does not scale, while directories scale to many cores at the cost of complexity and storage; large designs often combine hierarchical or hybrid approaches.
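The storage cost of a directory can be made concrete with a rough calculation. The sketch below assumes one particular organization, a full bit-vector directory that keeps one presence bit per cache for every memory block, and an assumed 64-byte block size; state bits are ignored, and the function name `full_map_overhead` is hypothetical.

```python
def full_map_overhead(n_caches, block_bytes=64):
    """Directory presence bits per block, as a fraction of the block's data bits.

    Assumes a full bit-vector directory: one presence bit per cache per block.
    """
    return n_caches / (block_bytes * 8)


for n in (16, 64, 256):
    print(f"{n:4d} caches: {full_map_overhead(n):.1%} storage overhead")
```

Under these assumptions the overhead grows linearly with the number of caches, from about 3% at 16 caches to 50% at 256, which is one reason large designs move to hierarchical or hybrid schemes.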

Key figures

  • David E. Culler
  • Mark D. Hill
  • James R. Goodman
  • John L. Hennessy

Seminal works

  • hennessy2019
  • culler1999

Frequently asked questions

What is the difference between coherence and consistency?
Coherence concerns a single memory location: a write must eventually become visible to every core, and all cores must observe writes to that location in the same order (write serialization). Consistency (the memory consistency model) concerns the ordering of operations across different locations as observed by different processors. Coherence is necessary but not sufficient for a well-defined consistency model.
What is false sharing?
False sharing occurs when independent variables used by different cores happen to lie in the same cache block. Because coherence operates at block granularity, writes by one core invalidate the block in others even though they touch different variables, causing needless coherence traffic and slowdown.
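Whether two variables can suffer false sharing depends only on whether their addresses map to the same block. The sketch below assumes a 64-byte block size (a common value, but hardware-dependent) and uses hypothetical helper names; the addresses are illustrative, not taken from a real program.

```python
BLOCK_BYTES = 64  # assumed cache block size


def block_of(addr, block_bytes=BLOCK_BYTES):
    """Index of the cache block that contains a byte address."""
    return addr // block_bytes


def falsely_shared(addr_a, addr_b, block_bytes=BLOCK_BYTES):
    """True if two distinct variables fall in the same cache block."""
    return (addr_a != addr_b
            and block_of(addr_a, block_bytes) == block_of(addr_b, block_bytes))


def padded_addresses(base, count, block_bytes=BLOCK_BYTES):
    """Place `count` variables one per block, starting at a block-aligned base."""
    return [base + i * block_bytes for i in range(count)]


# Two 8-byte per-thread counters packed next to each other share a block.
packed = [0x1000, 0x1008]
print(falsely_shared(*packed))  # True

# Padding each counter out to its own block removes the conflict.
padded = padded_addresses(0x1000, 2)
print(falsely_shared(*padded))  # False
```

Padding or aligning per-thread data to the block size trades some memory for the removal of this needless coherence traffic.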