If instructions execute out of order, how is the program still correct?

The processor only reorders execution, not results. Register renaming ensures it respects true data dependencies, and a reorder buffer commits results in the original program order, so the visible architectural state is exactly what sequential execution would produce.

What is register renaming?

Register renaming maps the architectural registers named in instructions onto a larger pool of physical registers. This removes false (name) dependencies that arise from reusing register names, letting more independent instructions execute in parallel without conflicting over register storage.

Out-of-Order Execution

Out-of-order execution lets a processor execute instructions as soon as their operands are ready rather than strictly in program order, using register renaming and buffering to expose parallelism while still producing in-order results.

Onderwerp vinden met PaperMindBinnenkortFind papers & topics

Tools & resources

Dia's downloaden

Learn & explore

VideoBinnenkort

Definition

Out-of-order execution is a microarchitectural technique in which a processor dynamically reorders instruction execution to run independent instructions whenever their operands are available, while using buffering and renaming to preserve the appearance of sequential, in-order completion.

Scope

This topic covers dynamic scheduling: reservation stations, register renaming to remove false dependencies, the reorder buffer that enforces in-order commit, and speculative execution recovery. It builds directly on Tomasulo's algorithm. It excludes the prediction of branch direction (branch prediction) and the broader notion of how much parallelism exists (instruction-level parallelism), focusing on the reordering machinery itself.

Core questions

How can a processor execute instructions out of program order yet produce correct, in-order results?
How does register renaming eliminate write-after-read and write-after-write dependencies?
What do reservation stations and the reorder buffer contribute?
How are exceptions and mispredictions recovered precisely in an out-of-order core?

Key concepts

dynamic scheduling
reservation stations
register renaming
common data bus
reorder buffer
in-order commit
precise exceptions
speculative execution recovery

Key theories

Tomasulo's algorithm: Tomasulo's scheme uses reservation stations and a common data bus to rename registers and dispatch instructions to functional units as their operands become available, allowing out-of-order execution that respects only true data dependencies.
Precise exceptions via in-order commit: A reorder buffer holds results of out-of-order instructions and commits them in program order, so that exceptions and mispredictions can be handled precisely as if execution had been strictly sequential.

Mechanisms

Decoded instructions are renamed to physical registers and placed in reservation stations or an issue queue, where they wait for operands broadcast on a result bus. When ready, they execute on available functional units in any order. A reorder buffer tracks original program order and commits results sequentially, discarding speculative results on a misprediction or exception so that architectural state is always precise.

Clinical relevance

Out-of-order execution is the core engine of high-performance CPUs, hiding memory and execution latencies by finding independent work to do. It also has security implications: because speculative, out-of-order work can transiently access data before being squashed, it enabled the Spectre and Meltdown classes of microarchitectural side-channel attacks.

History

Tomasulo introduced dynamic scheduling in the IBM System/360 Model 91 in 1967. The combination with the reorder buffer for precise exceptions, developed in the 1980s, made out-of-order execution practical for general-purpose processors, and it became standard in high-performance designs from the mid-1990s onward.

Debates

Performance versus complexity, power, and security: Out-of-order execution delivers strong single-thread performance but at significant cost in hardware complexity and power, and its speculation has proven exploitable; this fuels debate over how aggressively to speculate versus favoring simpler, more efficient, or more secure designs.

Key figures

Robert Tomasulo
Yale Patt
James E. Smith
John L. Hennessy

Seminal works

tomasulo1967
hennessy2019

Frequently asked questions

If instructions execute out of order, how is the program still correct?: The processor only reorders execution, not results. Register renaming ensures it respects true data dependencies, and a reorder buffer commits results in the original program order, so the visible architectural state is exactly what sequential execution would produce.
What is register renaming?: Register renaming maps the architectural registers named in instructions onto a larger pool of physical registers. This removes false (name) dependencies that arise from reusing register names, letting more independent instructions execute in parallel without conflicting over register storage.