ScholarGate
Ассистент

Cryptographic Hash Functions

A cryptographic hash function compresses arbitrary input into a short fixed-length digest in a way that is one-way and collision-resistant, serving as a basic building block for integrity, authentication, and many higher-level protocols.

Найти тему в PaperMindСкороFind papers & topics
Tools & resources
Скачать слайды
Learn & explore
ВидеоСкоро

Definition

A cryptographic hash function is a deterministic function mapping inputs of arbitrary length to a fixed-length output (digest) such that it is infeasible to find a preimage, a second preimage, or any two distinct inputs producing the same output.

Scope

This topic covers the security properties of hash functions (preimage, second-preimage, and collision resistance), their internal constructions (Merkle-Damgaard, sponge), the standardized families SHA-2 and SHA-3, and applications such as integrity verification, password storage, commitment, and proof-of-work. It addresses cryptanalytic attacks including the birthday bound and the breaks of MD5 and SHA-1. It excludes keyed authentication codes, which are treated under message authentication codes, though many MACs are built from hash functions.

Core questions

  • What distinguishes a cryptographic hash function from an ordinary checksum or hash table function?
  • Why are preimage, second-preimage, and collision resistance distinct, and how do they relate?
  • Why does the birthday paradox bound collision resistance at half the digest length?
  • How do Merkle-Damgaard and sponge constructions extend a fixed-size compression to arbitrary input?
  • How are hash functions used to commit to data, verify integrity, and store passwords?

Key concepts

  • message digest
  • preimage resistance
  • second-preimage resistance
  • collision resistance
  • birthday attack
  • Merkle-Damgaard construction
  • sponge construction (Keccak)
  • length-extension attack
  • proof-of-work

Key theories

Collision resistance and the birthday bound
Because of the birthday paradox, a generic collision-finding attack on an n-bit hash succeeds in about 2^(n/2) work; collision resistance therefore requires digests roughly twice the desired security level (e.g., 256 bits for 128-bit security).
Iterated and sponge constructions
The Merkle-Damgaard paradigm builds a full hash by iterating a fixed-input compression function over message blocks; the sponge construction used by SHA-3 (Keccak) absorbs input into and squeezes output from a large permutation state, avoiding length-extension weaknesses.

Mechanisms

A Merkle-Damgaard hash pads the message, splits it into blocks, and iteratively applies a compression function that mixes each block into a running chaining value, outputting the final value as the digest. SHA-3's sponge construction instead absorbs message blocks by XORing them into part of a large state permuted by the Keccak-f function, then squeezes out digest bits, which avoids the length-extension property inherent to plain Merkle-Damgaard.

Clinical relevance

Hash functions are pervasive: they verify software-download integrity, anchor digital signatures (which sign a hash, not the whole document), index content in Git and content-delivery networks, secure password storage via slow key-derivation functions (bcrypt, Argon2), and provide the proof-of-work and Merkle trees underlying blockchains. The practical break of MD5 and SHA-1 collision resistance forced migration to SHA-2 across the internet.

Evidence & guidelines

SHA-2 (SHA-256, SHA-512) is specified in FIPS 180-4 and SHA-3 in FIPS 202; both are NIST-approved. MD5 and SHA-1 are deprecated for security use after demonstrated collision attacks (the 2017 SHAttered SHA-1 collision being a landmark). For password storage, standards recommend memory-hard functions such as Argon2 rather than plain hashing.

History

The MD family (Rivest's MD4 and MD5) and the NSA-designed SHA-0/SHA-1 dominated the 1990s. Xiaoyun Wang's collision attacks (2004-2005) broke MD5 and weakened SHA-1, prompting NIST's open SHA-3 competition (2007-2012), won by the Keccak sponge design of Bertoni, Daemen, Peeters, and Van Assche. SHA-1 was practically broken by the SHAttered collision in 2017.

Key figures

  • Ralph Merkle
  • Ivan Damgaard
  • Guido Bertoni
  • Joan Daemen
  • Xiaoyun Wang

Related topics

Seminal works

  • nist2015sha3
  • katz2020
  • menezes1996

Frequently asked questions

Why can't I just hash a password and store the digest?
Plain cryptographic hashes are fast, which lets attackers try billions of guesses per second against a stolen database. Password storage instead uses deliberately slow, salted, memory-hard functions (bcrypt, scrypt, Argon2) so that brute-forcing each password is expensive.
Is SHA-256 broken now that SHA-1 is?
No. SHA-1 and SHA-256 are different algorithms; the collision attacks on SHA-1 do not transfer to SHA-2. SHA-256 remains unbroken and is the recommended general-purpose hash, alongside SHA-3.