What is the iterator model?

It is a design in which every physical operator implements the same interface — typically open, next, and close — and produces tuples one at a time when its parent calls next. Because all operators share this interface, they can be plugged together into arbitrary plan trees, and tuples flow up the tree on demand.

Why are some operators called blocking?

A blocking operator cannot produce any output until it has consumed all of its input — sorting and certain aggregations are examples, since the final result depends on every input tuple. Blocking operators must materialize their input, whereas pipelined operators such as selection can emit results as they go.

Query Execution Operators

Query execution operators are the physical algorithms — scans, selections, projections, sorts, aggregations, and joins — that a database engine composes into a plan and runs to produce a query's result.

Finn tema med PaperMindSnartFind papers & topics

Tools & resources

Last ned lysbilder

Learn & explore

VideoSnart

Definition

A query execution operator is an implementation of a relational-algebra operation as an algorithm over tuple streams; operators are composed into a tree or pipeline that, when executed, computes the result of a query.

Scope

This topic covers the physical operators of a query engine and how they are organized for execution: the iterator (open-next-close) model, pipelining versus materialization, table and index scans, selection and projection, external merge sort, grouping and aggregation, and duplicate elimination. It treats how operators consume and produce tuple streams and how memory and I/O affect their cost. It excludes the specifics of join algorithms and the optimizer that chooses among operators, which are adjacent topics.

Core questions

How does the iterator (open-next-close) model let operators be composed into a pipeline?
When is a result pipelined versus materialized to disk?
How are selection, projection, and duplicate elimination implemented physically?
How does external merge sort handle data larger than memory?
How do memory budget and I/O cost shape operator performance?

Key concepts

iterator / open-next-close model
table and index scans
selection and projection operators
external merge sort
hash-based grouping
aggregation operators
duplicate elimination
pipelining versus materialization

Key theories

The iterator (Volcano) model: Each operator exposes open, next, and close methods and pulls tuples from its children on demand; this uniform interface lets arbitrary operators be composed into pipelines and is the basis of most query engines.
Pipelining versus materialization: Pipelined operators pass tuples directly to their parent without writing intermediate results, saving I/O, whereas blocking operators such as sort must materialize their input before producing output; the optimizer balances the two.
External sorting and aggregation: When data exceeds memory, external merge sort and hash-based grouping process it in passes over disk; these algorithms underlie ORDER BY, GROUP BY, duplicate elimination, and sort-merge joins.

Clinical relevance

Execution operators are where a query plan's estimated cost becomes real runtime: the choice and implementation of operators, and how well they use memory and avoid unnecessary disk I/O, determine the throughput and latency of every query a database serves.

History

The iterator model was crystallized in Graefe's Volcano query-evaluation framework and his 1993 survey of query-evaluation techniques, which catalogued the physical operators used by relational engines. The model's uniform interface made extensible, composable query engines practical and remains standard, with later work adding vectorized and compiled execution for greater efficiency.

Key figures

Goetz Graefe
Jeffrey D. Ullman

Seminal works

graefe1993
garciamolina2008

Frequently asked questions

What is the iterator model?: It is a design in which every physical operator implements the same interface — typically open, next, and close — and produces tuples one at a time when its parent calls next. Because all operators share this interface, they can be plugged together into arbitrary plan trees, and tuples flow up the tree on demand.
Why are some operators called blocking?: A blocking operator cannot produce any output until it has consumed all of its input — sorting and certain aggregations are examples, since the final result depends on every input tuple. Blocking operators must materialize their input, whereas pipelined operators such as selection can emit results as they go.