Shared Memory and Data Parallelism

In shared-memory parallelism, multiple threads operate on a common address space; in data parallelism, the same operation is applied across many data elements at once.

Definition

In the shared-memory model, parallel threads of execution communicate implicitly by reading and writing a common memory, coordinating through synchronization primitives; data parallelism is the special case in which the same computation is applied independently across the elements of a data structure.
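As a minimal sketch of the data-parallel case, the loops below use OpenMP directives, one of several possible interfaces. The function names `saxpy` and `sum_of_squares` are illustrative and do not come from the text. With GCC or Clang the directives take effect when compiling with `-fopenmp`; a compiler that does not enable OpenMP ignores the pragmas and runs the loops serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise update y[i] = a * x[i] + y[i]. Each iteration touches only
// index i, so the iterations are independent and may run on different threads.
void saxpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Sum of squares. The accumulator is shared by all iterations, so the loop
// needs a reduction clause: each thread sums into a private copy and OpenMP
// combines the copies when the loop ends.
double sum_of_squares(const std::vector<double>& x) {
    double total = 0.0;
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += x[i] * x[i];
    }
    return total;
}
```

The first loop needs no synchronization because no two iterations share a location; the second shows the smallest step beyond that, where a shared accumulator has to be handled explicitly.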

Scope

This topic covers the shared-address-space model and its programming interfaces (threads, OpenMP), the synchronization primitives that coordinate threads (locks, barriers, atomics) and the hazards they manage (data races, deadlock, false sharing), data-parallel and loop-parallel constructs, and task-based runtimes with work-stealing schedulers. It addresses the programming side; the theoretical foundations are treated under shared-memory models.
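Of the hazards listed, deadlock is perhaps the easiest to show in a few lines. The sketch below assumes a hypothetical `Account` type and `transfer` function (neither is from the text) and relies on C++17 `std::scoped_lock`, which acquires several mutexes using a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical account type, used only for illustration.
struct Account {
    std::mutex m;
    long balance = 0;
};

// Moves `amount` from one account to another. std::scoped_lock acquires both
// mutexes together, so two transfers in opposite directions cannot each hold
// one lock while waiting for the other.
void transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);  // locking the same mutex twice would be an error
    std::scoped_lock lock(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}

// Runs opposite-direction transfers on two threads and returns the combined
// balance, which the transfers should leave unchanged.
long run_opposite_transfers(int rounds) {
    Account a, b;
    a.balance = 1000;
    b.balance = 1000;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance;
}
```

Had each thread instead locked `from.m` and then `to.m` with two separate `std::lock_guard` objects, the two threads would take the locks in opposite orders and could block each other indefinitely.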

Core questions

How do threads coordinate access to shared data without races or deadlock?
How can loops and array operations be expressed for parallel execution?
How are dynamically generated tasks balanced across cores efficiently?

Key theories

Synchronization and memory consistency: Correct shared-memory programs rely on synchronization primitives and on an understanding of the memory model, which governs when one thread's writes become visible to another. Missing synchronization can produce data races, and inconsistent lock ordering can produce deadlock.
Directive-based data parallelism: OpenMP lets programmers annotate sequential code with directives that parallelize loops and regions over a shared address space, providing a portable, incremental path to shared-memory parallelism.
Work-stealing task scheduling: Task-based runtimes schedule dynamically created tasks by having idle processors steal work from busy ones, achieving provably good load balance and bounded overhead for irregular parallel computations.
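The work-stealing idea can be sketched in a few dozen lines. This is a toy under stated assumptions, not the Cilk scheduler: each worker's deque is guarded by a mutex where production runtimes typically use lock-free deques, and the names `Range`, `WorkerQueue`, and `parallel_sum` are invented for illustration. Tasks are index ranges that split themselves until they are no larger than `grain`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

struct Range { std::size_t lo, hi; };  // a task: sum the half-open range [lo, hi)

// One deque per worker, protected by a mutex to keep the sketch simple.
struct WorkerQueue {
    std::mutex m;
    std::deque<Range> tasks;

    void push(Range r) {
        std::lock_guard<std::mutex> g(m);
        tasks.push_back(r);
    }
    // The owner takes the newest task (back of the deque).
    std::optional<Range> pop() {
        std::lock_guard<std::mutex> g(m);
        if (tasks.empty()) return std::nullopt;
        Range r = tasks.back();
        tasks.pop_back();
        return r;
    }
    // A thief takes the oldest task (front), which here is the largest one.
    std::optional<Range> steal() {
        std::lock_guard<std::mutex> g(m);
        if (tasks.empty()) return std::nullopt;
        Range r = tasks.front();
        tasks.pop_front();
        return r;
    }
};

long long parallel_sum(const std::vector<int>& data, unsigned workers, std::size_t grain) {
    assert(workers > 0 && grain > 0);
    std::vector<WorkerQueue> queues(workers);
    std::atomic<long long> total{0};
    std::atomic<std::size_t> pending{1};  // tasks created but not yet finished
    queues[0].push({0, data.size()});     // all work starts on worker 0

    auto run = [&](unsigned self) {
        while (pending.load() > 0) {
            std::optional<Range> task = queues[self].pop();
            for (unsigned k = 1; !task && k < workers; ++k) {
                task = queues[(self + k) % workers].steal();  // idle: try to steal
            }
            if (!task) { std::this_thread::yield(); continue; }
            Range r = *task;
            while (r.hi - r.lo > grain) {  // split: keep the left half, expose the right
                std::size_t mid = r.lo + (r.hi - r.lo) / 2;
                pending.fetch_add(1);      // count the new task before publishing it
                queues[self].push({mid, r.hi});
                r.hi = mid;
            }
            long long s = 0;
            for (std::size_t i = r.lo; i < r.hi; ++i) s += data[i];
            total.fetch_add(s);
            pending.fetch_sub(1);          // this leaf task is finished
        }
    };

    std::vector<std::thread> threads;
    for (unsigned w = 0; w < workers; ++w) threads.emplace_back(run, w);
    for (auto& t : threads) t.join();
    return total.load();
}
```

A call such as `parallel_sum(data, 4, 1000)` starts with all work on worker 0; the other workers obtain their share only by stealing, which is how load spreads without a central dispatcher. The sketch makes no claim to the overhead bounds proved for real work-stealing schedulers.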

Practical relevance

Shared-memory and data-parallel programming is how everyday software exploits multicore CPUs and how scientific kernels, simulations, and array-heavy workloads achieve speedups on a single node, making it the most widely encountered form of parallel programming.

History

Shared-memory multiprocessing matured with the threads model and the OpenMP standard for portable directive-based parallelism, first released for Fortran in 1997 with C/C++ following in 1998. Work-stealing schedulers from the Cilk project in the mid-1990s provided efficient task-based parallelism, and these ideas now underlie multicore programming across languages.

Key figures

Maurice Herlihy
Nir Shavit
Charles Leiserson
Vipin Kumar

Seminal works

herlihy2008
grama2003
blumofe1996

Frequently asked questions

What is a data race and why is it dangerous?

A data race occurs when two or more threads access the same memory location concurrently, at least one of the accesses is a write, and no synchronization orders them. The outcome is unpredictable (in C and C++ it is formally undefined behavior) and can vary between runs, producing intermittent, hard-to-reproduce bugs. This is why synchronization is essential in shared-memory programs.
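A shared counter is the usual illustration. The sketch below shows two race-free versions, one using a mutex and one using an atomic; the function names are illustrative. The racy variant appears only in a comment because executing it would be undefined behavior in C++.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Each of `threads` threads increments a shared counter `iters` times.
// With a plain `long counter` and no lock, `++counter` from several threads
// would be a data race: increments could be lost and the total could differ
// from run to run.

// Fix 1: a mutex lets only one thread at a time perform the update.
long count_with_mutex(int threads, int iters) {
    assert(threads >= 0 && iters >= 0);
    long counter = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                std::lock_guard<std::mutex> guard(m);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}

// Fix 2: an atomic makes each increment a single indivisible operation.
long count_with_atomic(int threads, int iters) {
    assert(threads >= 0 && iters >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

Both versions return exactly `threads * iters`. Relaxed ordering is enough for the atomic counter here because only the final total matters and `join` orders the last read after every increment.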