Shared Memory and Data Parallelism
Shared-memory parallelism has multiple threads operate on a common address space, while data parallelism applies the same operation across many data elements at once.
Definition
In the shared-memory model, parallel threads of execution communicate implicitly by reading and writing a common memory, coordinating through synchronization primitives. Data parallelism is the pattern, commonly expressed within this model, in which the same computation is applied independently across the elements of a data structure.
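As an illustration of the definition, the following is a minimal sketch of two data-parallel loops written with OpenMP directives. The function names `saxpy` and `sum` are chosen for this example only. It assumes a compiler with OpenMP support (for instance `-fopenmp` on GCC or Clang); without that support the pragmas are ignored and the loops run sequentially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Data-parallel update y[i] = a * x[i] + y[i].
// Each iteration reads and writes only its own element, so the
// iterations are independent and may run in any order.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    // With OpenMP enabled, the iterations are divided among threads
    // that all see the same x and y in the shared address space.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Sum of all elements. The reduction clause gives each thread a private
// partial sum and combines the partial sums when the loop ends, so the
// shared variable `total` is never updated concurrently.
double sum(const std::vector<double>& v) {
    double total = 0.0;
    const long n = static_cast<long>(v.size());
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```

The first loop needs no synchronization because no two iterations touch the same element. The second does combine values across iterations, which is why it is expressed as a reduction rather than as a plain shared update. Because floating-point addition is not associative, a parallel reduction may round differently from the sequential loop on general inputs.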
Scope
This topic covers the shared-address-space model and its programming interfaces (threads, OpenMP); the synchronization primitives that coordinate threads (locks, barriers, atomics) and the hazards they manage (data races, deadlock, false sharing); data-parallel and loop-parallel constructs; and task-based runtimes with work-stealing schedulers. It treats the programming side of the subject; the theoretical foundations appear under shared-memory models.
Core questions
- How do threads coordinate access to shared data without races or deadlock?
- How can loops and array operations be expressed for parallel execution?
- How are dynamically generated tasks balanced across cores efficiently?
Key theories
- Synchronization and memory consistency
- Correct shared-memory programs rely on synchronization primitives and on an understanding of the memory model, which governs when one thread's writes become visible to another. Too little synchronization permits data races, while careless use of locks (for example, acquiring them in inconsistent orders) can cause deadlock.
- Directive-based data parallelism
- OpenMP lets programmers annotate sequential code with directives that parallelize loops and regions over a shared address space, providing a portable, incremental path to shared-memory parallelism.
- Work-stealing task scheduling
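- Task-based runtimes schedule dynamically created tasks by having idle processors steal work from busy ones. For a broad class of computations this gives provably good load balance and bounded overhead: the classical analysis bounds the expected running time on P processors by T1/P + O(T∞), where T1 is the total work and T∞ is the length of the critical path. This makes the approach well suited to irregular parallel computations.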
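The work-stealing idea in the last item can be sketched as follows. This is a teaching sketch under stated assumptions, not the scheduler of any particular runtime: the names `WorkStealingPool`, `sum_range`, and `parallel_sum` are hypothetical, and each deque is protected by a plain mutex, whereas production runtimes typically use lock-free deques. Each worker takes the newest task from its own deque, and an idle worker steals the oldest task from another worker's deque.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

class WorkStealingPool {
public:
    using Task = std::function<void(int worker)>;

    explicit WorkStealingPool(int workers) : queues_(workers) {}

    // Push a task onto the bottom of `worker`'s own deque.
    void spawn(int worker, Task task) {
        pending_.fetch_add(1);  // counted before it becomes visible
        std::lock_guard<std::mutex> lock(queues_[worker].m);
        queues_[worker].tasks.push_back(std::move(task));
    }

    // Start one thread per deque and return when all tasks have finished.
    void run() {
        std::vector<std::thread> threads;
        for (int w = 0; w < static_cast<int>(queues_.size()); ++w)
            threads.emplace_back([this, w] { worker_loop(w); });
        for (auto& t : threads) t.join();
    }

private:
    struct Queue { std::mutex m; std::deque<Task> tasks; };

    bool pop_bottom(int w, Task& out) {      // owner: newest task first
        std::lock_guard<std::mutex> lock(queues_[w].m);
        if (queues_[w].tasks.empty()) return false;
        out = std::move(queues_[w].tasks.back());
        queues_[w].tasks.pop_back();
        return true;
    }

    bool steal_top(int victim, Task& out) {  // thief: oldest task first
        std::lock_guard<std::mutex> lock(queues_[victim].m);
        if (queues_[victim].tasks.empty()) return false;
        out = std::move(queues_[victim].tasks.front());
        queues_[victim].tasks.pop_front();
        return true;
    }

    void worker_loop(int self) {
        const int n = static_cast<int>(queues_.size());
        while (pending_.load() > 0) {
            Task task;
            bool found = pop_bottom(self, task);
            for (int k = 1; !found && k < n; ++k)
                found = steal_top((self + k) % n, task);
            if (!found) { std::this_thread::yield(); continue; }
            task(self);             // may spawn further tasks
            pending_.fetch_sub(1);  // finished only after its spawns are counted
        }
    }

    std::vector<Queue> queues_;
    std::atomic<long> pending_{0};
};

// Sums data[lo, hi) by recursively splitting the range into tasks.
void sum_range(WorkStealingPool& pool, int worker, const std::vector<int>& data,
               std::size_t lo, std::size_t hi, std::atomic<long>& total) {
    if (hi - lo <= 1024) {  // small enough: do the work directly
        total.fetch_add(std::accumulate(data.begin() + lo, data.begin() + hi, 0L));
        return;
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    // Spawn the right half where idle workers can steal it; keep the left.
    pool.spawn(worker, [&pool, &data, mid, hi, &total](int w) {
        sum_range(pool, w, data, mid, hi, total);
    });
    sum_range(pool, worker, data, lo, mid, total);
}

long parallel_sum(const std::vector<int>& data, int workers) {
    WorkStealingPool pool(workers);
    std::atomic<long> total{0};
    pool.spawn(0, [&](int w) { sum_range(pool, w, data, 0, data.size(), total); });
    pool.run();
    return total.load();
}
```

All work starts on worker 0, so the other workers make progress only by stealing. Stealing from the opposite end of the deque tends to hand thieves the largest remaining subranges, which keeps the number of steals low. The `pending_` counter is a simple termination check chosen for this sketch; workers that find no task merely yield and retry.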
Practical relevance
Shared-memory and data-parallel programming is how everyday software exploits multicore CPUs and how scientific kernels, simulations, and array-heavy workloads achieve speedups on a single node, making it the most widely encountered form of parallel programming.
History
Shared-memory multiprocessing matured with the threads model and the OpenMP standard for portable directive-based parallelism, first published for Fortran in 1997 and for C and C++ in 1998. Work-stealing schedulers from the Cilk project, developed from the mid-1990s onward, provided efficient task-based parallelism. These ideas now underlie multicore programming across languages.
Key figures
- Maurice Herlihy
- Nir Shavit
- Charles Leiserson
- Vipin Kumar
Related topics
Seminal works
- herlihy2008
- grama2003
- blumofe1996
Frequently asked questions
- What is a data race and why is it dangerous?
- A data race occurs when two threads access the same memory location concurrently, at least one of the accesses is a write, and nothing synchronizes them. The outcome then depends on thread timing and may vary between runs, and in languages such as C and C++ the program's behavior is formally undefined. The resulting bugs are intermittent and hard to reproduce, which is why synchronization is essential in shared-memory programs.
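The answer above can be made concrete with a shared counter. The sketch below shows two common ways to remove the race, a mutex and an atomic variable. The function names `count_with_mutex` and `count_with_atomic` are chosen for this example only. If the lock in the first function were removed, the concurrent `++counter` would be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Each of `num_threads` threads increments a shared counter `increments`
// times. The mutex makes every read-modify-write of `counter` exclusive.
long count_with_mutex(int num_threads, int increments) {
    long counter = 0;  // plain shared variable: every access needs the lock
    std::mutex guard;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<std::mutex> lock(guard);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();  // join makes all updates visible here
    return counter;
}

// Same computation with an atomic counter: fetch_add is an indivisible
// read-modify-write, so no lock is needed.
long count_with_atomic(int num_threads, int increments) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                // Relaxed ordering suffices because only the final total
                // matters and join() orders it before the load below.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Both versions always return `num_threads * increments`. An unsynchronized version might return a smaller value on some runs and the expected value on others, which is what makes such bugs hard to reproduce.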