ScholarGate

Recovery and Logging

Recovery and logging mechanisms guarantee the atomicity and durability of transactions by recording changes in a log so that, after a crash, committed work can be redone and uncommitted work undone.

Definition

Database recovery is the process of restoring the database to a consistent state after a failure, so that the effects of committed transactions are durable and aborted or in-flight transactions leave no trace. Logging is the technique of recording transaction actions in a durable log to make this possible.

Scope

This topic covers how a database survives failures: the write-ahead logging (WAL) protocol, undo and redo information, checkpoints to bound recovery work, and the standard recovery algorithm (notably ARIES) with its analysis, redo, and undo passes. It also treats the buffer-management policies (steal/no-steal, force/no-force) that determine what logging is required. It excludes two adjacent topics: the concurrency-control protocols that run during normal operation, and distributed commit.

Core questions

  • Why must the log record reach durable storage before the data it describes (write-ahead logging)?
  • How do undo and redo restore a consistent state after a crash?
  • How do buffer-management policies (steal/force) determine logging requirements?
  • What role do checkpoints play in bounding recovery time?
  • How does the ARIES algorithm structure recovery into analysis, redo, and undo?

Key concepts

  • write-ahead logging (WAL)
  • undo and redo logging
  • log sequence number
  • checkpoints
  • steal/no-steal and force/no-force policies
  • compensation log records
  • analysis, redo, undo passes
  • ARIES

Key theories

Write-ahead logging
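The WAL protocol requires that the log record describing a change be forced to stable storage before the corresponding data page is written, and that all of a transaction's log records be forced before the transaction is reported as committed. Together these rules ensure that after a crash the system has enough information to undo uncommitted changes and redo committed ones.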
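The page-write half of the rule can be sketched as a toy buffer manager. This is a minimal illustration, not any particular system's implementation: the names `Page`, `Log`, `BufferManager`, `page_lsn`, and `flushed_lsn` are assumptions made for the example, and `flush` only stands in for a real sync of the log file.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: str
    value: int = 0
    page_lsn: int = 0  # LSN of the latest log record applied to this page


@dataclass
class Log:
    records: list = field(default_factory=list)  # in-memory log tail
    flushed_lsn: int = 0                         # highest LSN on stable storage

    def append(self, txn, page_id, before, after):
        lsn = len(self.records) + 1
        self.records.append((lsn, txn, page_id, before, after))
        return lsn

    def flush(self, up_to_lsn):
        # Stand-in for forcing the log file to disk up to the given LSN.
        self.flushed_lsn = max(self.flushed_lsn, min(up_to_lsn, len(self.records)))


class BufferManager:
    def __init__(self, log):
        self.log = log
        self.disk = {}  # page_id -> (value, page_lsn); stand-in for data files

    def update(self, txn, page, new_value):
        # Log first (in memory), then change the buffered page.
        lsn = self.log.append(txn, page.page_id, page.value, new_value)
        page.value = new_value
        page.page_lsn = lsn

    def write_page(self, page):
        # WAL rule: the log must be durable up to page_lsn before the page is written.
        if self.log.flushed_lsn < page.page_lsn:
            self.log.flush(page.page_lsn)
        self.disk[page.page_id] = (page.value, page.page_lsn)


log = Log()
bm = BufferManager(log)
p = Page("A", value=10)
bm.update("T1", p, 20)
flushed_before = log.flushed_lsn  # nothing forced yet
bm.write_page(p)                  # forces the log, then writes the page
flushed_after = log.flushed_lsn
```

The invariant to notice is that every page on the simulated disk carries a `page_lsn` no greater than `flushed_lsn`, so recovery always finds the log record for any change that reached the data files.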
Undo/redo recovery and buffer policies
Whether the system needs undo, redo, or both depends on buffer policies: a steal policy (writing uncommitted pages to disk) requires undo, and a no-force policy (not forcing committed pages at commit) requires redo; the common steal/no-force combination requires both.
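The dependency between buffer policies and logging needs can be written down as a small truth table. The function name `logging_requirements` is hypothetical and the sketch only encodes the rule stated above.

```python
def logging_requirements(steal: bool, force: bool) -> dict:
    """Map a buffer policy pair to the recovery information it requires."""
    return {
        "undo": steal,      # uncommitted pages may reach disk, so they must be reversible
        "redo": not force,  # committed pages may be missing from disk, so they must be repeatable
    }


table = {
    (steal, force): logging_requirements(steal, force)
    for steal in (False, True)
    for force in (False, True)
}

for (steal, force), needs in table.items():
    policy = f"{'steal' if steal else 'no-steal'}/{'force' if force else 'no-force'}"
    print(f"{policy:18} undo={needs['undo']!s:5} redo={needs['redo']}")
```

Only no-steal/force needs neither kind of information, at the cost of pinning uncommitted pages in memory and writing every modified page at commit, which is why steal/no-force is the common choice.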
ARIES
ARIES is the widely adopted recovery method that uses write-ahead logging, log sequence numbers, and a three-pass (analysis, redo, undo) algorithm with compensation log records to support fine-grained locking and partial rollbacks.
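The three passes can be illustrated with a heavily simplified sketch. It is not the published algorithm: it assumes one value per page, scans the whole log instead of starting from a checkpoint, treats a commit record as the end of a transaction, and omits end records and the dirty-page-table test during redo. The record layout (`lsn`, `type`, `txn`, `page`, `before`, `after`, `prev_lsn`, `undo_next`) and the function name `recover` are assumptions made for the example.

```python
def recover(log, disk):
    """log: list of record dicts in LSN order; disk: page -> (value, page_lsn)."""
    by_lsn = {r["lsn"]: r for r in log}

    # Analysis: rebuild the transaction table and the dirty page table.
    txn_table, dirty_pages = {}, {}
    for r in log:
        if r["type"] == "commit":
            txn_table.pop(r["txn"], None)                # committed: not a loser
        else:
            txn_table[r["txn"]] = r["lsn"]               # last LSN per transaction
            dirty_pages.setdefault(r["page"], r["lsn"])  # first LSN that dirtied the page

    # Redo: repeat history for every logged change not yet reflected on the page.
    start = min(dirty_pages.values(), default=None)
    for r in log:
        if r["type"] == "commit" or start is None or r["lsn"] < start:
            continue
        _, page_lsn = disk.get(r["page"], (None, 0))
        if page_lsn < r["lsn"]:
            disk[r["page"]] = (r["after"], r["lsn"])

    # Undo: roll back losers newest-first, logging a CLR for each undone update.
    to_undo = set(txn_table.values())
    next_lsn = log[-1]["lsn"] + 1 if log else 1
    while to_undo:
        lsn = max(to_undo)
        to_undo.discard(lsn)
        r = by_lsn[lsn]
        if r["type"] == "clr":
            nxt = r["undo_next"]          # skip work that was already undone
        else:
            clr = {"lsn": next_lsn, "type": "clr", "txn": r["txn"],
                   "page": r["page"], "after": r["before"],
                   "undo_next": r["prev_lsn"]}
            log.append(clr)
            by_lsn[next_lsn] = clr
            disk[r["page"]] = (r["before"], next_lsn)
            next_lsn += 1
            nxt = r["prev_lsn"]
        if nxt is not None:
            to_undo.add(nxt)
    return sorted(txn_table)              # the loser transactions


# T1 committed; T2 was in flight. Page B reached disk (steal), page A did not (no-force).
log = [
    {"lsn": 1, "type": "update", "txn": "T1", "page": "A", "before": 10, "after": 20, "prev_lsn": None},
    {"lsn": 2, "type": "update", "txn": "T2", "page": "B", "before": 5, "after": 7, "prev_lsn": None},
    {"lsn": 3, "type": "commit", "txn": "T1"},
    {"lsn": 4, "type": "update", "txn": "T2", "page": "A", "before": 20, "after": 30, "prev_lsn": 2},
]
disk = {"A": (10, 0), "B": (7, 2)}
losers = recover(log, disk)
```

After the call, page A holds T1's committed value 20 and page B is back to 5. The two compensation log records appended during undo are what would let a second crash during recovery resume without undoing the same update twice.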

Practical relevance

Recovery and logging are what make durability real: they ensure that once a system confirms a transaction such as a payment or order, that fact survives power loss and crashes, and that a crash mid-transaction never leaves the database in a half-updated, inconsistent state.

History

Härder and Reuter's 1983 survey laid out the principles of transaction-oriented recovery and the buffer-policy taxonomy. ARIES, developed by C. Mohan and colleagues at IBM and published in 1992, became the de facto standard recovery algorithm, combining write-ahead logging with log sequence numbers and compensation records to support fine-granularity locking.

Key figures

  • C. Mohan
  • Jim Gray
  • Theo Härder
  • Andreas Reuter

Seminal works

  • mohan1992
  • haerder1983
  • gray1992

Frequently asked questions

Why is write-ahead logging necessary?
Because the buffer manager may write a modified page to disk before its transaction commits, or may still hold a committed transaction's changes only in memory when the system crashes. Forcing the log record before the data page guarantees that, whatever the buffer manager did, recovery has enough information to undo uncommitted changes and redo committed ones to reach a consistent state.
What do checkpoints accomplish?
A checkpoint periodically writes a record to the log that captures which transactions are active (and, in ARIES, which buffer pages are dirty), giving recovery a recent starting point. Without checkpoints, recovery might have to scan the entire log from the beginning; checkpoints bound how far back recovery must process, keeping restart time manageable.
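The effect on restart work can be seen in a toy scan that rebuilds the set of active transactions. This is a sketch under simplifying assumptions: the record fields (`type`, `txn`, `active`) and the function names are hypothetical, and the checkpoint here stores only the active transactions, not dirty pages.

```python
def last_checkpoint_index(log):
    """Return the index of the most recent checkpoint record, or 0 if there is none."""
    for i in range(len(log) - 1, -1, -1):
        if log[i]["type"] == "checkpoint":
            return i
    return 0


def active_at_crash(log):
    """Rebuild the set of in-flight transactions, scanning only from the last checkpoint."""
    start = last_checkpoint_index(log)
    active = set()
    if log and log[start]["type"] == "checkpoint":
        active = set(log[start]["active"])  # state captured when the checkpoint was taken
    scanned = 0
    for r in log[start:]:
        scanned += 1
        if r["type"] == "begin":
            active.add(r["txn"])
        elif r["type"] in ("commit", "abort"):
            active.discard(r["txn"])
    return active, scanned


log = [
    {"type": "begin", "txn": "T1"},
    {"type": "begin", "txn": "T2"},
    {"type": "commit", "txn": "T1"},
    {"type": "checkpoint", "active": ["T2"]},
    {"type": "begin", "txn": "T3"},
    {"type": "commit", "txn": "T3"},
]
active, scanned = active_at_crash(log)
```

Here the scan reads three records instead of six and still finds that T2 was in flight. The checkpoint bounds this forward scan only; undoing T2 can still require reading its older records from before the checkpoint.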
