Bias-Variance and Overfitting

The bias-variance trade-off explains how model complexity controls prediction error, with overfitting and underfitting as the two failure modes a learner must balance.

Definition

The bias-variance trade-off is the principle that expected prediction error decomposes into bias (error from a model too simple to capture the truth), variance (error from a model too sensitive to the particular training sample), and irreducible noise. Increasing model complexity typically lowers bias and raises variance, so the choice of complexity moves error between the two.
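
For squared-error loss, the decomposition can be written out explicitly. The notation below is an assumption, since the text fixes no symbols: a fresh test observation is $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, $\hat f$ is the model fitted on a random training sample, and the expectation runs over training samples and test noise.

```latex
\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat f(x) - \mathbb{E}[\hat f(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Bias enters squared, and the noise term does not depend on the model, so only the first two terms respond to changes in complexity.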

Scope

This topic covers the decomposition of expected prediction error into bias, variance, and irreducible noise; the meaning of overfitting and underfitting; and the role of regularization in shifting the balance. It also covers the classical U-shaped error curve and recent observations of double descent in heavily overparameterized models.

Core questions

How does expected error decompose into bias, variance, and noise?
What characterizes overfitting versus underfitting?
How does regularization shift the bias-variance balance?
Why can very flexible models sometimes generalize despite high capacity?

Key theories

Bias-variance decomposition: For squared-error loss, expected error splits into squared bias, variance, and irreducible noise, making explicit how simplifying assumptions reduce variance at the cost of bias and vice versa.
Overfitting and regularization: Overfitting occurs when a model captures noise rather than signal; regularization penalizes complexity to reduce variance, trading a small increase in bias for a larger decrease in variance.
Beyond the classical trade-off: In heavily overparameterized regimes, test error can fall again once capacity passes the interpolation point, where the model fits the training data exactly. This double-descent phenomenon complicates the classical picture of a single U-shaped curve.
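
The decomposition in the first entry above can be checked numerically. The following is a minimal sketch under stated assumptions, not a method from the cited works: it assumes a ground truth of sin(2πx), Gaussian noise, a fixed grid of inputs, and k-nearest-neighbour averaging as the learner, with small k standing in for a flexible model and large k for a rigid one. The names `knn_predict` and `bias_variance_at` are hypothetical.

```python
import math
import random


def f(x):
    """Assumed ground-truth function (illustrative choice)."""
    return math.sin(2 * math.pi * x)


def knn_predict(xs, ys, x0, k):
    """Average the targets of the k training inputs closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def bias_variance_at(x0, k, n=50, sigma=0.3, trials=2000, seed=0):
    """Monte Carlo estimate of squared bias and variance of the fit at x0.

    Each trial redraws the noise on a fixed input grid, refits the model,
    and records its prediction at x0.
    """
    rng = random.Random(seed)
    xs = [i / (n - 1) for i in range(n)]
    preds = []
    for _ in range(trials):
        ys = [f(x) + rng.gauss(0.0, sigma) for x in xs]
        preds.append(knn_predict(xs, ys, x0, k))
    mean_pred = sum(preds) / trials
    bias_sq = (mean_pred - f(x0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / trials
    return bias_sq, variance


if __name__ == "__main__":
    for k in (1, 5, 25):
        b, v = bias_variance_at(0.25, k)
        print(f"k={k:2d}  bias^2={b:.4f}  variance={v:.4f}")
```

At the peak of the sine (x0 = 0.25), averaging over many neighbours pulls the prediction downward, so squared bias grows with k while variance shrinks roughly like sigma squared over k. This sketch does not reproduce double descent, which requires a model family that can pass the interpolation point.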

Practical relevance

The bias-variance trade-off is the practical heart of model fitting, guiding choices of model size, regularization strength, and feature count to minimize error on new data; diagnosing whether a model is underfitting or overfitting is a routine and essential step in applied machine learning.
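
In practice, the diagnosis described above usually comes down to comparing training and validation error as complexity grows. The sketch below is illustrative only and rests on assumptions not in the text: synthetic data from sin(2πx) plus Gaussian noise, and a piecewise-constant model whose size is its number of equal-width bins. The helper names (`fit_bins`, `error_curves`) are hypothetical.

```python
import math
import random


def make_data(n, rng, sigma=0.3):
    """Noisy samples of an assumed ground truth, sin(2*pi*x), on [0, 1)."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_bins(xs, ys, n_bins):
    """Piecewise-constant fit: one mean per equal-width bin on [0, 1]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)
    # Bins with no training points fall back to the overall training mean.
    return [s / c if c else overall for s, c in zip(sums, counts)]


def mse(model, xs, ys):
    """Mean squared error of a fitted bin model on (xs, ys)."""
    n_bins = len(model)
    total = 0.0
    for x, y in zip(xs, ys):
        pred = model[min(int(x * n_bins), n_bins - 1)]
        total += (pred - y) ** 2
    return total / len(xs)


def error_curves(bin_counts, n_train=100, n_val=500, seed=0):
    """Return {n_bins: (training MSE, validation MSE)}."""
    rng = random.Random(seed)
    train_x, train_y = make_data(n_train, rng)
    val_x, val_y = make_data(n_val, rng)
    curves = {}
    for n_bins in bin_counts:
        model = fit_bins(train_x, train_y, n_bins)
        curves[n_bins] = (mse(model, train_x, train_y),
                          mse(model, val_x, val_y))
    return curves


if __name__ == "__main__":
    for n_bins, (train, val) in error_curves([1, 8, 64]).items():
        print(f"bins={n_bins:2d}  train MSE={train:.3f}  val MSE={val:.3f}")
```

Read the two columns together: high error on both sets points to underfitting (one bin here), while low training error paired with clearly higher validation error points to overfitting (64 bins for 100 training points). The intermediate model should have the lowest validation error.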

History

The bias-variance decomposition was articulated for neural networks and statistical learning by Geman, Bienenstock, and Doursat in 1992 and became a standard lens in statistics and machine learning. Regularization theory formalized complexity control as an explicit penalty on the fitted model, and the recent double-descent findings have prompted a reexamination of the trade-off for modern overparameterized models.

Key figures

Stuart Geman
Trevor Hastie
Christopher Bishop

Seminal works

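Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1-58.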

Frequently asked questions

What is the difference between overfitting and underfitting? Underfitting is when a model is too simple to capture the underlying pattern, giving high bias and poor performance even on training data. Overfitting is when a model is so flexible it fits the noise in the training data, giving high variance and poor performance on new data.
How does regularization help? Regularization adds a penalty on model complexity, discouraging extreme or numerous parameters. This reduces variance, usually at the cost of a small increase in bias, and so lowers the total error on unseen data when complexity would otherwise be too high.
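
The effect of a complexity penalty can be seen in the smallest possible case. The sketch below is a toy illustration under assumed settings, not a recipe from the sources: a single-weight linear model fitted with an L2 (ridge) penalty, for which the minimizer of sum((y - w*x)^2) + lam*w^2 is w = sum(x*y) / (sum(x*x) + lam). The true weight, noise level, inputs, and the function names are all hypothetical choices.

```python
import random


def ridge_slope(xs, ys, lam):
    """Minimizer of sum((y - w*x)^2) + lam * w^2 for a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def simulate(lam, true_w=1.0, sigma=1.0, trials=5000, seed=0):
    """Monte Carlo squared bias, variance, and their sum for the estimate.

    Inputs are fixed; each trial redraws the noise and refits the weight.
    """
    rng = random.Random(seed)
    xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
    estimates = []
    for _ in range(trials):
        ys = [true_w * x + rng.gauss(0.0, sigma) for x in xs]
        estimates.append(ridge_slope(xs, ys, lam))
    mean_w = sum(estimates) / trials
    bias_sq = (mean_w - true_w) ** 2
    variance = sum((w - mean_w) ** 2 for w in estimates) / trials
    return bias_sq, variance, bias_sq + variance


if __name__ == "__main__":
    for lam in (0.0, 1.0, 100.0):
        b, v, total = simulate(lam)
        print(f"lam={lam:6.1f}  bias^2={b:.3f}  variance={v:.3f}  "
              f"sum={total:.3f}")
```

The quantities here describe error in the estimated weight; the prediction error at an input x0 is that error scaled by x0 squared, plus the irreducible noise. With no penalty the estimate is unbiased but noisy, a moderate penalty trades a little bias for a larger drop in variance, and a very large penalty shrinks the weight so hard that bias dominates, which is underfitting by over-regularization.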