Weakly Informative and Regularizing Priors
Weakly informative priors are deliberately mild, proper priors that rule out implausible parameter values and stabilize estimation without imposing strong substantive beliefs.
Definition
A weakly informative prior is a proper prior chosen to be broad on the scale of plausible parameter values, providing enough information to regularize the posterior and improve computation while remaining uncommitted about the specific value within that range.
Scope
This topic covers the rationale for weakly informative priors over flat priors, their regularizing and shrinkage effects, default choices for regression coefficients and scale parameters, and sparsity-inducing priors such as the horseshoe and the Bayesian Lasso.
Core questions
- Why are weakly informative priors preferred to flat or improper priors in practice?
- How do priors regularize estimates and shrink them toward plausible values?
- What default priors are recommended for regression coefficients and variance parameters?
- How do sparsity priors such as the horseshoe handle many potentially zero coefficients?
Key concepts
- weakly informative prior
- regularization
- shrinkage
- horseshoe prior
- Bayesian Lasso
- scale prior
- separation
Key theories
- Regularization via priors
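- A prior with finite scale penalizes extreme estimates, which reduces estimator variance and keeps estimates finite under separation, where the maximum-likelihood estimate of a logistic regression coefficient diverges. Many penalized-likelihood estimators correspond to posterior modes under specific priors: ridge regression to a Gaussian prior and the lasso to a Laplace (double-exponential) prior.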
- Global-local shrinkage
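- Sparsity priors such as the horseshoe give each coefficient its own heavy-tailed local scale, multiplied by a single global scale shared across all coefficients. The global scale pulls small coefficients strongly toward zero, while the heavy tails of the local scales let large signals escape shrinkage.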
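The global-local mechanism can be illustrated with a minimal sketch. It assumes a normal-means setting with unit noise variance, half-Cauchy(0, 1) local scales, and a fixed global scale `tau = 1`; the function names are hypothetical and chosen only for this illustration. Under these assumptions the conditional posterior mean of a coefficient is `(1 - kappa) * y`, where `kappa = 1 / (1 + tau^2 * lambda^2)` is the shrinkage weight.

```python
import math
import random


def horseshoe_shrinkage_weights(n_draws, tau=1.0, seed=0):
    """Draw shrinkage weights kappa = 1 / (1 + tau^2 * lambda^2).

    lambda is a half-Cauchy(0, 1) local scale. In a normal-means model with
    unit noise variance (an assumption of this sketch), kappa near 1 means
    "shrink to zero" and kappa near 0 means "leave the observation alone".
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n_draws):
        # Inverse-CDF draw from a half-Cauchy(0, 1): tan(pi * u / 2), u in [0, 1).
        lam = math.tan(math.pi * rng.random() / 2.0)
        weights.append(1.0 / (1.0 + (tau * lam) ** 2))
    return weights


def share_in(weights, low, high):
    """Fraction of weights falling in the closed interval [low, high]."""
    return sum(low <= w <= high for w in weights) / len(weights)


kappa = horseshoe_shrinkage_weights(n_draws=100_000)
near_zero = share_in(kappa, 0.0, 0.1)   # signals that escape shrinkage
near_one = share_in(kappa, 0.9, 1.0)    # coefficients shrunk almost to zero
middle = share_in(kappa, 0.45, 0.55)    # half-hearted shrinkage

print(f"share with kappa <= 0.1:         {near_zero:.3f}")
print(f"share with kappa >= 0.9:         {near_one:.3f}")
print(f"share with 0.45 <= kappa <= 0.55: {middle:.3f}")
```

With `tau = 1` the weights follow a Beta(1/2, 1/2) distribution, so the prior mass piles up near 0 and near 1 and is thin in the middle. That U shape is the "horseshoe": a priori, a coefficient is expected either to be shrunk almost entirely or to be left almost untouched.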
Clinical relevance
Regularizing priors stabilize estimates in high-dimensional and sparse problems such as genomics and biomarker selection, and they keep estimates finite and plausible when the data only weakly identify the parameters.
History
As Bayesian computation became routine in the 2000s, attention shifted from flat 'noninformative' priors to weakly informative defaults that improve both inference and sampling. Sparsity priors, including the Bayesian Lasso and the 2010 horseshoe estimator, extended this thinking to high-dimensional regression.
Debates
- How weak should a default prior be?
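- There is ongoing discussion about how to set the scale of weakly informative priors. A scale that is too wide provides little regularization, while a scale that is too narrow, or one that is set without regard to the units and transformation of the parameter, can unintentionally bias conclusions on the scale that matters.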
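One way to make the scale question concrete is to translate a prior into the scale of the outcome. The sketch below assumes a logistic regression coefficient with a Normal(0, prior_sd^2) prior, interpreted as a log-odds shift from a baseline probability of 0.5; the function name and the candidate scales are illustrative assumptions, not recommendations from the cited works.

```python
import math


def prior_mass_on_extreme_probabilities(prior_sd, extreme=0.99):
    """Prior probability that logistic(beta) falls outside [1 - extreme, extreme].

    Assumes beta ~ Normal(0, prior_sd^2) is a log-odds shift for a one-unit
    change in a predictor, starting from a baseline probability of 0.5.
    """
    cutoff = math.log(extreme / (1.0 - extreme))  # logit of the upper bound
    # P(|beta| > cutoff) for a zero-mean normal, via the complementary error function.
    return math.erfc(cutoff / (prior_sd * math.sqrt(2.0)))


for sd in (1.0, 2.5, 10.0, 100.0):
    mass = prior_mass_on_extreme_probabilities(sd)
    print(f"prior sd = {sd:>5}: P(implied probability outside [0.01, 0.99]) = {mass:.4f}")
```

A very wide prior on the log-odds scale, which looks vague, places most of its mass on implied probabilities near 0 or 1. In that sense a nominally weak prior can be strongly informative on the scale that matters.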
Key figures
- Andrew Gelman
- Nicholas Polson
- James Scott
- Carlos Carvalho
Related topics
Seminal works
- gelman2008
- carvalho2010
Frequently asked questions
- How is a weakly informative prior different from a noninformative one?
- A noninformative prior tries to add as little information as possible and may be improper, while a weakly informative prior is proper and intentionally adds mild information to exclude implausible values and stabilize the analysis.
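The stabilizing effect can be seen in a minimal sketch of separation in logistic regression. It assumes a toy, perfectly separated data set, a single slope with no intercept, plain gradient ascent, and a Normal(0, 2.5^2) prior whose scale is picked only for illustration; `fit_slope` is a hypothetical helper, not a library function. Maximizing the log posterior under the normal prior is the same as maximizing an L2-penalized likelihood.

```python
import math


def fit_slope(x, y, prior_sd=None, steps=20000, lr=0.1):
    """Gradient ascent for a no-intercept logistic regression slope.

    With prior_sd=None this climbs the log-likelihood (flat prior).
    With a finite prior_sd it climbs the log posterior under a
    Normal(0, prior_sd^2) prior, i.e. an L2-penalized likelihood.
    """
    beta = 0.0
    for _ in range(steps):
        grad = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-beta * xi))
            grad += (yi - p) * xi
        if prior_sd is not None:
            # Gradient of the log prior density of Normal(0, prior_sd^2).
            grad -= beta / prior_sd ** 2
        beta += lr * grad
    return beta


# Perfectly separated toy data: y = 1 exactly when x > 0.
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [0, 0, 0, 1, 1, 1]

flat_short = fit_slope(x, y, steps=2000)
flat_long = fit_slope(x, y, steps=20000)
map_short = fit_slope(x, y, prior_sd=2.5, steps=2000)
map_long = fit_slope(x, y, prior_sd=2.5, steps=20000)

print(f"flat prior, 2000 steps:  {flat_short:.3f}")
print(f"flat prior, 20000 steps: {flat_long:.3f}")
print(f"normal prior, 2000 steps:  {map_short:.3f}")
print(f"normal prior, 20000 steps: {map_long:.3f}")
```

Under the flat prior the slope keeps growing as more iterations are run, because the likelihood has no finite maximum on separated data. With the proper prior the estimate settles at a finite value and no longer changes with further iterations.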