Cache Coherence Protocols
Cache coherence protocols keep the multiple private caches in a shared-memory multiprocessor consistent, ensuring that when one core writes a memory location, other cores do not continue to read stale cached copies.
Definition
A cache coherence protocol is a hardware mechanism that maintains a consistent view of shared memory across the private caches of multiple processors. It tracks the state of each cached block and coordinates reads and writes so that every write eventually becomes visible to all processors and stale copies are never read.
Scope
This topic covers the coherence problem and its solutions: invalidation versus update protocols, snooping protocols on a shared bus, directory-based protocols for scalable systems, and the canonical state machines such as MSI and MESI. It treats how hardware preserves a coherent view of memory across caches. It excludes the ordering of operations across addresses (memory consistency, under shared-memory-and-coherence) and single-cache policies (cache organization and policies).
Core questions
- What does it mean for caches to be coherent, and why is coherence needed?
- How do invalidation and update protocols differ?
- How do snooping protocols use a shared interconnect to maintain coherence?
- How do directory protocols scale coherence beyond a shared bus?
Key concepts
- coherence problem
- invalidation vs update protocols
- snooping protocols
- directory-based protocols
- MSI and MESI states
- shared bus and interconnect
- false sharing
- write serialization
Key theories
- State-based coherence (MSI/MESI)
- Each cached block is tracked in a small set of states (such as Modified, Exclusive, Shared, Invalid); reads and writes trigger state transitions and coherence messages that ensure each block has at most one writer at a time and that readers never see stale data.
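The state transitions described above can be sketched as a small function. This is a simplified illustration of one common textbook formulation of MESI, not a description of any particular processor; the event names (`local_read`, `local_write`, `remote_read`, `remote_write`) and the `others_have_copy` flag are assumptions introduced for the example, and data movement such as write-backs is only noted in comments.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache block in one cache.

    `event` is a hypothetical label: a read or write by the local core,
    or a read or write by another core observed through the protocol.
    """
    if event == "local_read":
        if state is State.I:
            # Read miss: load as Shared if another cache holds the block,
            # otherwise as Exclusive.
            return State.S if others_have_copy else State.E
        return state  # M, E and S all satisfy a local read as a hit
    if event == "local_write":
        # From I or S a real protocol first invalidates other copies;
        # from E the upgrade needs no coherence message.
        return State.M
    if event == "remote_read":
        if state in (State.M, State.E):
            # A Modified block must also supply or write back its data.
            return State.S
        return state
    if event == "remote_write":
        return State.I  # another core is taking ownership
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    s = State.I
    for ev in ["local_read", "local_write", "remote_read", "remote_write"]:
        s = next_state(s, ev)
        print(ev, "->", s.name)
```

Dropping the Exclusive state (always loading as Shared on a read miss) reduces this sketch to MSI; the Exclusive state exists so that a core writing a block no one else holds can skip the invalidation message.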
Mechanisms
Each cache block carries a coherence state. In snooping protocols every cache observes (snoops) bus transactions and updates or invalidates its copies accordingly; in directory protocols a directory records which caches hold each block and sends targeted coherence messages. A write to a shared block first invalidates other copies (in an invalidation protocol), granting the writer exclusive ownership before it modifies the data.
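The write-invalidate behavior can be made concrete with a toy snooping model. The sketch below is a deliberately simplified assumption-laden illustration: the `Cache` and `Bus` classes are hypothetical, each block holds a single value, only Modified and Shared states are stored (an absent entry means Invalid), and the bus is modeled as a plain loop over the attached caches.

```python
class Cache:
    """Private cache: addr -> (state, value); a missing entry is Invalid."""

    def __init__(self, name):
        self.name = name
        self.lines = {}


class Bus:
    """Toy shared bus that lets every attached cache snoop each transaction."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, requester, addr):
        line = requester.lines.get(addr)
        if line is not None:
            return line[1]  # hit in M or S
        # Read miss: other caches snoop; a Modified owner writes back
        # its value and downgrades to Shared.
        for cache in self.caches:
            if cache is not requester and addr in cache.lines:
                state, value = cache.lines[addr]
                if state == "M":
                    self.memory[addr] = value
                    cache.lines[addr] = ("S", value)
        value = self.memory.get(addr, 0)
        requester.lines[addr] = ("S", value)
        return value

    def write(self, requester, addr, value):
        # Invalidate every other copy before the write takes effect;
        # a Modified copy elsewhere is written back first.
        for cache in self.caches:
            if cache is not requester and addr in cache.lines:
                state, old = cache.lines.pop(addr)
                if state == "M":
                    self.memory[addr] = old
        requester.lines[addr] = ("M", value)


if __name__ == "__main__":
    bus = Bus({0x40: 1})
    a, b = Cache("A"), Cache("B")
    bus.attach(a)
    bus.attach(b)
    bus.read(a, 0x40)
    bus.read(b, 0x40)
    bus.write(a, 0x40, 7)      # B's copy is invalidated here
    print(bus.read(b, 0x40))   # B misses and obtains A's value
```

A directory protocol would replace the loop over all caches with a lookup of the recorded sharers for `addr`, so that only those caches receive messages.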
Practical relevance
Coherence makes shared-memory multiprocessing programmable: software can treat memory as a single shared store while hardware hides the existence of multiple caches. Coherence traffic and effects such as false sharing, however, can dominate performance in parallel programs, so awareness of coherence is important for scalable multithreaded software.
History
Coherence protocols emerged with bus-based multiprocessors in the 1980s, where snooping schemes such as write-invalidate and the MESI family became standard. As systems scaled past a single bus, directory-based protocols were developed for distributed shared memory, and coherence remains a central design problem in multicore processors.
Debates
- Snooping versus directory coherence
- Snooping is simple and fast on small shared-bus systems but does not scale, because every cache must observe every coherence transaction. Directories scale to many cores at the cost of added complexity and storage, so large designs often use hierarchical or hybrid approaches.
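The storage cost of a directory can be estimated with simple arithmetic. The sketch below assumes one particular organization, a full-map directory that keeps one presence bit per cache for every memory block; the function name, the 64-byte default block size, and the optional `state_bits` parameter are assumptions for illustration, and real designs often use more compact encodings.

```python
def full_map_overhead(num_caches, block_bytes=64, state_bits=0):
    """Directory bits per block divided by data bits per block.

    Assumes a full-map directory: one presence bit per cache,
    plus `state_bits` of per-block state.
    """
    data_bits = block_bytes * 8
    return (num_caches + state_bits) / data_bits


if __name__ == "__main__":
    for n in (8, 64, 256):
        print(f"{n} caches: {full_map_overhead(n):.1%} overhead")
```

Under these assumptions, 64 caches cost 12.5% extra storage per 64-byte block and 256 caches cost 50%, which is one reason large systems look for sparser or hierarchical directory organizations.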
Key figures
- David E. Culler
- Mark D. Hill
- James R. Goodman
- John L. Hennessy
Related topics
Seminal works
- hennessy2019
- culler1999
Frequently asked questions
- What is the difference between coherence and consistency?
- Coherence concerns a single memory location: all processors must observe writes to that location in the same order (write serialization), and a read must eventually return the most recently written value. Consistency (the memory consistency model) concerns the ordering of operations across different locations as observed by different processors. Coherence is necessary but not sufficient for a well-defined consistency model.
- What is false sharing?
- False sharing occurs when independent variables used by different cores happen to lie in the same cache block. Because coherence operates at block granularity, writes by one core invalidate the block in others even though they touch different variables, causing needless coherence traffic and slowdown.
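Whether two variables can falsely share is a matter of address arithmetic. The sketch below assumes a 64-byte block size (a common value, but not universal) and uses hypothetical helper names; it checks whether two byte addresses fall in the same block and shows the usual remedy of padding each variable out to its own block.

```python
BLOCK_BYTES = 64  # assumed block size for illustration


def block_index(addr, block_bytes=BLOCK_BYTES):
    """Index of the cache block that contains byte address `addr`."""
    return addr // block_bytes


def shares_block(addr_a, addr_b, block_bytes=BLOCK_BYTES):
    """True if the two addresses lie in the same cache block."""
    return block_index(addr_a, block_bytes) == block_index(addr_b, block_bytes)


def padded_offsets(count, block_bytes=BLOCK_BYTES):
    """Offsets that place each of `count` variables in its own block."""
    return [i * block_bytes for i in range(count)]


if __name__ == "__main__":
    # Two adjacent 8-byte counters, one per core, packed back to back.
    packed = [0, 8]
    print("packed counters share a block:", shares_block(*packed))
    # The same counters, each padded out to a full block.
    padded = padded_offsets(2)
    print("padded counters share a block:", shares_block(*padded))
```

Padding trades memory for reduced coherence traffic, so it is typically applied only to data that different cores write frequently.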