Bayes and Shrinkage Estimation

Bayes estimators blend prior belief with data to minimize average risk, and shrinkage estimators exploit the surprising fact that pulling estimates toward a center can dominate the obvious estimator.

Definition

A Bayes estimator minimizes the expected loss averaged over a prior distribution on the parameter; a shrinkage estimator deliberately biases an estimate toward a fixed point or common mean to reduce its overall mean squared error.
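
The two halves of this definition meet in the simplest conjugate case. As a minimal sketch, assume n observations drawn from a normal distribution with known variance and a normal prior on the mean; under squared-error loss the Bayes estimator is then the posterior mean, which is a precision-weighted average that pulls the sample mean toward the prior mean. The function name and the numbers below are illustrative assumptions, not taken from the source.

```python
def posterior_mean_normal(xbar, n, sigma2, prior_mean, prior_var):
    """Posterior mean of theta when x_i ~ N(theta, sigma2), sigma2 known,
    and theta ~ N(prior_mean, prior_var). Illustrative helper (hypothetical name)."""
    data_precision = n / sigma2        # precision of the sample mean
    prior_precision = 1.0 / prior_var  # precision of the prior
    # Share of the total precision contributed by the data
    weight = data_precision / (data_precision + prior_precision)
    # Shrink the sample mean toward the prior mean
    return weight * xbar + (1.0 - weight) * prior_mean


# Data and prior are equally precise here, so the weight is 0.5 and the
# estimate lands halfway between the sample mean (10.0) and the prior mean (0.0).
estimate = posterior_mean_normal(xbar=10.0, n=4, sigma2=4.0,
                                 prior_mean=0.0, prior_var=1.0)
print(estimate)  # 5.0
```

As n grows the weight on the data approaches 1, so the shrinkage fades and the estimate approaches the sample mean.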

Scope

This topic covers prior distributions and the posterior, Bayes estimators as posterior means under squared-error loss and other loss functions, the relationship between Bayes risk and frequentist risk, the James-Stein estimator and Stein's paradox of inadmissibility in three or more dimensions, empirical Bayes and hierarchical shrinkage, and the bias-variance trade-off that makes shrinkage advantageous.

Core questions

How is a Bayes estimator derived from the posterior distribution under a given loss function?
Why does the James-Stein estimator dominate the sample mean in three or more dimensions?
How does empirical Bayes borrow strength across related estimation problems?
When does the bias introduced by shrinkage pay off in reduced risk?

Key theories

Bayes estimators and posterior expectation: Under squared-error loss the Bayes estimator is the posterior mean; under other losses it is the matching posterior summary (for example, the posterior median under absolute-error loss). In each case it minimizes the Bayes risk, that is, the frequentist risk averaged over the prior.
Stein's paradox and the James-Stein estimator: When estimating three or more normal means simultaneously, the sample mean is inadmissible under total squared-error loss, and the James-Stein estimator, which shrinks toward a common point, has strictly smaller risk for every value of the true means.
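
The domination result can be checked numerically. The sketch below uses the textbook form of the James-Stein estimator, shrinking toward the origin with one observation per coordinate and known unit variance, and estimates both risks by Monte Carlo. The true means, the number of replications, and the function names are assumptions chosen for illustration.

```python
import random


def james_stein(x, sigma2=1.0):
    """James-Stein estimate for one draw x ~ N(theta, sigma2 * I), shrinking toward 0."""
    p = len(x)
    if p < 3:
        raise ValueError("the domination result needs at least 3 dimensions")
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) * sigma2 / norm_sq
    return [shrink * v for v in x]


def squared_error(estimate, theta):
    """Total squared-error loss summed over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def simulate_risks(theta, n_rep=20000, seed=0):
    """Monte Carlo estimates of the risk of the raw observation and of James-Stein."""
    rng = random.Random(seed)
    loss_raw = 0.0
    loss_js = 0.0
    for _ in range(n_rep):
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        loss_raw += squared_error(x, theta)
        loss_js += squared_error(james_stein(x), theta)
    return loss_raw / n_rep, loss_js / n_rep


theta = [1.0, -0.5, 2.0, 0.0, 0.5]  # hypothetical true means, p = 5
risk_raw, risk_js = simulate_risks(theta)
print(risk_raw, risk_js)
```

The risk of the raw observation should come out close to p = 5, with the James-Stein risk below it. The gain is largest when the true means lie near the shrinkage target and fades as they move far away from it.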

Practical relevance

Shrinkage and empirical-Bayes estimators improve accuracy when many related quantities are estimated at once, as in small-area estimation, sports and education rankings, genomics, and ridge and regularized regression, where pooling information across units beats treating each in isolation.
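
Pooling information across units can be sketched with a simple moment-based empirical-Bayes rule. The assumptions are that each unit's observation is normal around its own true value with a known common variance, and that the true values themselves come from a normal distribution whose mean and variance are estimated from the data. This is an illustrative variant, not the exact Efron-Morris estimator, and the function name and data are hypothetical.

```python
from statistics import mean, variance


def empirical_bayes_shrink(observations, sigma2):
    """Shrink each observation toward the grand mean.

    Assumed model: x_i ~ N(theta_i, sigma2) with sigma2 known and positive,
    and theta_i ~ N(mu, tau2) with mu and tau2 estimated from the observations.
    """
    grand_mean = mean(observations)
    # Marginally Var(x_i) = tau2 + sigma2, so estimate tau2 by moments, floored at 0
    tau2_hat = max(0.0, variance(observations) - sigma2)
    # Fraction of each unit's deviation from the grand mean that is retained
    weight = tau2_hat / (tau2_hat + sigma2)
    return [grand_mean + weight * (x - grand_mean) for x in observations]


raw = [0.2, 0.9, 1.4, 2.1, 3.4]  # hypothetical per-unit estimates
shrunk = empirical_bayes_shrink(raw, sigma2=0.5)
print(shrunk)
```

When the spread of the observations is no larger than the noise variance, the estimated prior variance is zero and every unit collapses to the grand mean. When the spread is much larger than the noise, the estimates stay close to the raw values.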

History

Stein showed in 1956 that the usual estimator of a multivariate normal mean is inadmissible in three or more dimensions, and James and Stein exhibited a dominating estimator in 1961. Efron and Morris reframed the result through empirical Bayes in the 1970s, making shrinkage a practical tool.

Key figures

Charles Stein
Willard James
Bradley Efron
James O. Berger

Seminal works

Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis (2nd ed.). Springer.

Frequently asked questions

Why would a biased estimator ever be preferred?: Because mean squared error is the sum of squared bias and variance; a small bias that buys a large reduction in variance can lower total error, which is exactly what shrinkage estimators exploit.
Is Stein's paradox really a paradox?: It is surprising rather than contradictory. Shrinking several unrelated means jointly lowers the total squared-error risk summed over all of them, even though no individual estimate is guaranteed to improve.
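
The trade-off in the first answer can be made concrete with a one-dimensional worked example. As a sketch, assume a single observation X from a normal distribution with mean \theta and variance \sigma^2, and consider the scaled estimator cX with 0 \le c \le 1; the symbols c and c^* are introduced here for illustration.

```latex
\begin{align*}
\operatorname{MSE}(\hat\theta)
  &= \mathbb{E}\big[(\hat\theta - \theta)^2\big]
   = \operatorname{Bias}(\hat\theta)^2 + \operatorname{Var}(\hat\theta), \\
\operatorname{MSE}(cX)
  &= (1 - c)^2\,\theta^2 + c^2\,\sigma^2, \\
c^* &= \frac{\theta^2}{\theta^2 + \sigma^2}, \qquad
\operatorname{MSE}(c^* X) = \frac{\theta^2\,\sigma^2}{\theta^2 + \sigma^2} < \sigma^2
  = \operatorname{MSE}(X).
\end{align*}
```

The unbiased choice c = 1 is therefore never optimal when \sigma^2 > 0. The best c depends on the unknown \theta, which is why practical shrinkage rules such as James-Stein and empirical Bayes estimate the amount of shrinkage from the data.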