Why use QR instead of the normal equations for least squares?

Forming the normal equations squares the condition number of the matrix, which can destroy accuracy. A QR factorization works with the original matrix through orthogonal transformations and is backward stable, so it gives more reliable least-squares solutions.

What makes the SVD so widely used?

The SVD simultaneously reveals rank, norm, condition number, and the optimal low-rank approximation of a matrix, all through orthogonal factors that are numerically well-behaved, which is why it underpins data compression, denoising, and dimensionality reduction.

Matrix Factorizations

Matrix factorizations express a matrix as a product of simpler factors — triangular, orthogonal, or diagonal — from which solutions, least-squares fits, and spectral information can be read off stably.

Definition

A matrix factorization is a representation of a matrix as a product of two or more matrices with special structure, chosen so that fundamental problems — solving systems, fitting data, computing rank or norms — become straightforward and numerically stable.

Scope

This topic covers the QR factorization (via Householder reflections and Givens rotations or Gram-Schmidt), the Cholesky factorization for symmetric positive-definite matrices, and the singular value decomposition, together with their use in least-squares problems, rank determination, and low-rank approximation.

Core questions

How does the QR factorization solve least-squares problems without forming the ill-conditioned normal equations?
When is the Cholesky factorization applicable, and why is it cheaper and more stable than general LU for symmetric positive-definite matrices?
What information about rank, norm, and low-rank approximation does the singular value decomposition reveal?
Which orthogonalization scheme — Householder, Givens, or Gram-Schmidt — best preserves accuracy?

Key theories

QR factorization and orthogonalization: Any matrix can be written as A = QR with Q having orthonormal columns and R upper-triangular; computed with Householder reflections it is backward stable and provides the standard route to linear least-squares solutions.
Singular value decomposition: Every matrix factors as A = U S V-transpose with U, V orthogonal and S diagonal with nonnegative singular values; the SVD reveals rank, the 2-norm and condition number, the four fundamental subspaces, and the optimal low-rank approximation via the Eckart-Young theorem.
Cholesky factorization: A symmetric positive-definite matrix factors as A = L L-transpose with L lower-triangular; the factorization requires no pivoting for stability and costs about half as much as general LU.

Mechanisms

Householder QR introduces zeros below the diagonal one column at a time using orthogonal reflections, accumulating Q implicitly; Givens rotations zero individual entries and suit sparse or updating contexts. Cholesky exploits symmetry and positive-definiteness to compute L directly. The SVD is computed in two phases — bidiagonalization by orthogonal transformations followed by an iterative diagonalization of the bidiagonal form — keeping all intermediate quantities well-conditioned.

Clinical relevance

Matrix factorizations are the engines behind least-squares data fitting, principal component analysis and dimensionality reduction, regularization of ill-posed inverse problems, recommender systems, and model order reduction, where the SVD in particular provides the mathematically optimal low-rank summary of high-dimensional data.

History

The systematic numerical use of orthogonal factorizations grew in the 1950s and 1960s with Householder's reflections and the Golub-Kahan algorithm for the SVD, which turned the singular value decomposition from a theoretical tool into a routinely computable one and made it central to least-squares and data analysis.

Key figures

Alston S. Householder
Gene H. Golub
William Kahan

Seminal works

trefethen1997
golub2013

Frequently asked questions

Why use QR instead of the normal equations for least squares?: Forming the normal equations squares the condition number of the matrix, which can destroy accuracy. A QR factorization works with the original matrix through orthogonal transformations and is backward stable, so it gives more reliable least-squares solutions.
What makes the SVD so widely used?: The SVD simultaneously reveals rank, norm, condition number, and the optimal low-rank approximation of a matrix, all through orthogonal factors that are numerically well-behaved, which is why it underpins data compression, denoising, and dimensionality reduction.