Why not just solve the normal equations directly?

The normal equations involve the product of the design matrix with its transpose, which squares the condition number and can severely degrade accuracy for moderately ill-conditioned problems. Solving via QR or the SVD works with the original matrix and is far more stable.

How does least-squares differ from minimax approximation?

Least squares minimizes the sum (or integral) of squared errors, which spreads the error and is sensitive to outliers, whereas minimax minimizes the largest error. Least squares leads to linear equations and is easier to compute; minimax gives uniformly small error.

Least-Squares Approximation

Least-squares approximation finds the function or parameter vector that minimizes the sum of squared residuals, providing the optimal fit in the Euclidean (L2) sense and the standard tool for fitting models to noisy or overdetermined data.

Nájsť tému v PaperMindČoskoroFind papers & topics

Tools & resources

Stiahnuť snímky

Learn & explore

VideoČoskoro

Definition

Least-squares approximation is the determination of the element of a chosen approximating set that minimizes the L2 norm (sum or integral of squared residuals) of the discrepancy from a target function or data set.

Scope

This topic covers the linear least-squares problem, its characterization through the normal equations and orthogonal projection, stable solution by QR factorization and the singular value decomposition, continuous L2 approximation by orthogonal polynomials, and the conditioning issues that distinguish reliable from unreliable solution methods.

Core questions

How is the least-squares solution characterized geometrically as an orthogonal projection?
Why do the normal equations solve the problem in principle but threaten accuracy in practice?
How do QR factorization and the SVD provide stable solutions, and when is the SVD essential?
How do orthogonal polynomials make continuous least-squares approximation well-conditioned?

Key theories

Normal equations and orthogonal projection: The least-squares solution makes the residual orthogonal to the approximating subspace, which yields the normal equations; geometrically the best approximation is the orthogonal projection of the data onto that subspace.
Stable solution via orthogonal factorization: Because forming the normal equations squares the condition number, accurate least-squares solutions are computed via QR factorization or, for rank-deficient or near-singular problems, via the singular value decomposition and its associated pseudoinverse.

Mechanisms

For a discrete overdetermined system, a QR factorization of the design matrix reduces the least-squares problem to a well-conditioned triangular solve, avoiding the squared conditioning of the normal equations. For rank-deficient problems the SVD yields the minimum-norm least-squares solution through the Moore-Penrose pseudoinverse and exposes near-rank-deficiency through small singular values. In the continuous setting, expanding in orthogonal polynomials diagonalizes the problem so the coefficients are computed independently as inner products.

Clinical relevance

Least squares is the backbone of data fitting and regression across the sciences and engineering, of parameter estimation and calibration, of signal and image reconstruction, and of the linearized subproblems inside nonlinear optimization; its conditioning analysis guides regularization choices when data are noisy or the model is over-parameterized.

History

The method of least squares was published by Legendre in 1805 and developed with a probabilistic justification by Gauss; its numerical treatment was transformed in the twentieth century by orthogonal-factorization algorithms, especially the Golub-led use of QR and the SVD, which replaced the unstable normal-equations approach in high-quality software.

Key figures

Carl Friedrich Gauss
Adrien-Marie Legendre
Gene H. Golub
Ake Bjorck

Seminal works

bjorck1996
trefethen1997

Frequently asked questions

Why not just solve the normal equations directly?: The normal equations involve the product of the design matrix with its transpose, which squares the condition number and can severely degrade accuracy for moderately ill-conditioned problems. Solving via QR or the SVD works with the original matrix and is far more stable.
How does least-squares differ from minimax approximation?: Least squares minimizes the sum (or integral) of squared errors, which spreads the error and is sensitive to outliers, whereas minimax minimizes the largest error. Least squares leads to linear equations and is easier to compute; minimax gives uniformly small error.