How is a weakly informative prior different from a noninformative one?

A noninformative prior tries to add as little information as possible and may be improper, while a weakly informative prior is proper and intentionally adds mild information to exclude implausible values and stabilize the analysis.

Weakly Informative and Regularizing Priors

Weakly informative priors are deliberately mild, proper priors that rule out implausible parameter values and stabilize estimation without imposing strong substantive beliefs.

Definition

A weakly informative prior is a proper prior chosen to be broad on the scale of plausible parameter values, providing enough information to regularize the posterior and improve computation while remaining uncommitted about the specific value within that range.

Scope

This topic covers the rationale for weakly informative priors over flat priors, their regularizing and shrinkage effects, default choices for regression coefficients and scale parameters, and sparsity-inducing priors such as the horseshoe and the Bayesian Lasso.

Core questions

Why are weakly informative priors preferred to flat or improper priors in practice?
How do priors regularize estimates and shrink them toward plausible values?
What default priors are recommended for regression coefficients and variance parameters?
How do sparsity priors such as the horseshoe handle many potentially zero coefficients?

Key concepts

weakly informative prior
regularization
shrinkage
horseshoe prior
Bayesian Lasso
scale prior
separation

Key theories

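Regularization via priors: A proper prior with finite scale penalizes extreme estimates, which reduces variance and keeps estimates finite under separation, where the maximum-likelihood estimate diverges. Many penalized-likelihood estimators correspond to posterior modes under specific priors: ridge regression under a Gaussian prior, for example, and the lasso under a Laplace (double-exponential) prior.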
Global-local shrinkage: Sparsity priors such as the horseshoe combine a global scale shared by all coefficients with a heavy-tailed local scale for each coefficient, so that small coefficients are shrunk strongly toward zero while large signals escape shrinkage.
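
The global-local idea can be illustrated with a minimal sketch in the simplest setting, the normal means model y_i ~ Normal(beta_i, 1) with beta_i ~ Normal(0, tau^2 * lambda_i^2) and lambda_i drawn from a half-Cauchy(0, 1). In that model the conditional posterior mean is (1 - kappa_i) * y_i, where kappa_i = 1 / (1 + tau^2 * lambda_i^2) is the shrinkage weight. The function name, the choice tau = 1, and the cutoffs 0.1 and 0.9 are assumptions made here for illustration only.

```python
import numpy as np

def horseshoe_shrinkage_weights(n_draws, tau=1.0, seed=0):
    """Draw shrinkage weights kappa = 1 / (1 + tau^2 * lambda^2).

    lambda is a local scale drawn from a half-Cauchy(0, 1); tau is the
    global scale (fixed here for illustration, not estimated).
    """
    rng = np.random.default_rng(seed)
    local_scale = np.abs(rng.standard_cauchy(n_draws))  # half-Cauchy draws
    return 1.0 / (1.0 + (tau * local_scale) ** 2)

kappa = horseshoe_shrinkage_weights(200_000)

# kappa near 0: coefficient left almost untouched (a large signal escapes).
# kappa near 1: coefficient shrunk almost entirely to zero (noise removed).
near_zero = np.mean(kappa < 0.1)
near_one = np.mean(kappa > 0.9)
middle = np.mean((kappa >= 0.1) & (kappa <= 0.9))

print(f"share with kappa < 0.1: {near_zero:.3f}")
print(f"share with kappa > 0.9: {near_one:.3f}")
print(f"share in between:       {middle:.3f}")
```

With tau = 1 the weights follow a Beta(1/2, 1/2) distribution, so roughly a fifth of the prior mass sits in each of the two extreme bands. That U shape is what lets the prior either keep or remove a coefficient rather than shrinking everything by a similar moderate amount.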

Clinical relevance

Regularizing priors stabilize estimates in high-dimensional and sparse problems such as genomics and biomarker selection, and they keep estimates from drifting to extreme or infinite values when the data only weakly identify the parameters.
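
A small sketch can make the stabilizing effect concrete. The toy data below are hypothetical and perfectly separated, so the logistic log-likelihood keeps rising as the slope grows and has no finite maximum. Adding the log density of a Normal(0, 2.5) prior gives a posterior with a finite mode. The data, the single-slope model without an intercept, the Normal prior, its scale of 2.5, and the grid search are all assumptions chosen for illustration, not a recommended workflow.

```python
import numpy as np

# Hypothetical, perfectly separated toy data: y is 1 exactly when x > 0.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

def log_likelihood(beta):
    """Logistic log-likelihood for a single slope with no intercept."""
    signed = beta * x * (2 * y - 1)  # positive when the fit agrees with y
    return -np.sum(np.logaddexp(0.0, -signed))

def log_posterior(beta, prior_sd=2.5):
    """Log-likelihood plus a Normal(0, prior_sd) log prior, up to a constant."""
    return log_likelihood(beta) - 0.5 * (beta / prior_sd) ** 2

grid = np.linspace(0.0, 30.0, 3001)
loglik = np.array([log_likelihood(b) for b in grid])
logpost = np.array([log_posterior(b) for b in grid])

# Without a prior the best value is always the largest slope tried.
beta_ml_on_grid = grid[np.argmax(loglik)]
# With the prior the maximum is an interior, finite posterior mode.
beta_map = grid[np.argmax(logpost)]

print(f"likelihood maximizer on grid: {beta_ml_on_grid:.2f} (grid edge)")
print(f"posterior mode on grid:       {beta_map:.2f}")
```

Widening the grid moves the likelihood maximizer to the new edge, while the posterior mode stays where it is.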

History

As Bayesian computation became routine in the 2000s, attention shifted from flat 'noninformative' priors to weakly informative defaults that improve both inference and sampling. Sparsity priors, including the Bayesian Lasso and the 2010 horseshoe estimator, extended this thinking to high-dimensional regression.

Debates

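How weak should a default prior be?: There is ongoing discussion about how to set the scale of weakly informative priors so that they regularize usefully without unintentionally biasing conclusions. Part of the difficulty is that a prior that looks weak on one scale, such as log-odds, can be strongly informative on another, such as probabilities.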
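
One way to make this question concrete is to check what a prior implies on the scale where conclusions are drawn. The sketch below assumes a logistic-regression intercept with a Normal(0, sd) prior on the log-odds scale and computes how much prior mass lands on probabilities below 0.01 or above 0.99. The function name, the cutoffs, and the two scales compared (10 and 1.5) are assumptions chosen for illustration.

```python
import math

def extreme_probability_mass(prior_sd, lo=0.01, hi=0.99):
    """Prior mass a Normal(0, prior_sd) log-odds prior puts outside (lo, hi)
    once the log-odds are mapped to the probability scale."""
    def normal_cdf(value):
        return 0.5 * (1.0 + math.erf(value / (prior_sd * math.sqrt(2.0))))

    def logit(p):
        return math.log(p / (1.0 - p))

    return normal_cdf(logit(lo)) + (1.0 - normal_cdf(logit(hi)))

wide = extreme_probability_mass(10.0)    # a seemingly "vague" prior
narrow = extreme_probability_mass(1.5)   # a tighter, weakly informative prior

print(f"Normal(0, 10):  {wide:.3f} of prior mass on p < 0.01 or p > 0.99")
print(f"Normal(0, 1.5): {narrow:.4f} of prior mass on p < 0.01 or p > 0.99")
```

Under these assumptions the wide prior places well over half of its mass on near-certain outcomes, which is a strong statement on the probability scale even though the prior looks diffuse on the log-odds scale.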

Key figures

Andrew Gelman
Nicholas Polson
James Scott
Carlos Carvalho

Seminal works

gelman2008
carvalho2010

Frequently asked questions

How is a weakly informative prior different from a noninformative one?: A noninformative prior tries to add as little information as possible and may be improper, while a weakly informative prior is proper and intentionally adds mild information to exclude implausible values and stabilize the analysis.