Hyperpriors and Shrinkage — ScholarGate Atlas

Definition

A hyperprior is a prior distribution on the hyperparameters that govern the distribution of group-level parameters; together with the data it determines the posterior for the group-level variance and hence the degree of shrinkage applied to each group.
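As a minimal sketch of where the hyperprior sits, consider a normal hierarchical model with known sampling variances. The symbols below (y_j, theta_j, sigma_j, mu, tau) are notation assumed for illustration and do not come from a particular source.

```latex
\begin{aligned}
y_j \mid \theta_j &\sim \mathcal{N}(\theta_j, \sigma_j^2), && j = 1, \dots, J && \text{(data level)} \\
\theta_j \mid \mu, \tau &\sim \mathcal{N}(\mu, \tau^2) && && \text{(group level)} \\
(\mu, \tau) &\sim p(\mu, \tau) && && \text{(hyperprior)}
\end{aligned}
```

Here tau is the group-level standard deviation, and p(mu, tau) is the hyperprior whose choice is discussed in the rest of this topic.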

Scope

This topic covers the specification of priors for hierarchical means and especially variance components, the way the group-level variance governs shrinkage, the danger of degenerate posteriors from poor variance priors, and recommended weakly informative choices such as half-Cauchy and half-normal priors.

Core questions

Why does the group-level variance control the amount of shrinkage?
What goes wrong when an inappropriate prior is used for a variance component?
Which weakly informative hyperpriors are recommended for scale parameters?
How does shrinkage relate to the Stein and empirical Bayes results?
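The first question can be illustrated with a small numerical sketch. It assumes a normal-normal model in which each group estimate y_j has a known standard error sigma and the group effects have mean mu and standard deviation tau. Under those assumptions the conditional posterior mean of each group effect is (1 - B) * y_j + B * mu, with B = sigma^2 / (sigma^2 + tau^2). The data values, and the shortcut of treating mu as known and equal to the average of the estimates, are hypothetical simplifications.

```python
def shrinkage_factor(sigma, tau):
    """Weight B placed on the common mean: sigma^2 / (sigma^2 + tau^2)."""
    return sigma**2 / (sigma**2 + tau**2)


def shrunken_estimates(y, sigma, mu, tau):
    """Conditional posterior means of the group effects, given mu and tau."""
    b = shrinkage_factor(sigma, tau)
    return [(1.0 - b) * y_j + b * mu for y_j in y]


# Hypothetical group estimates sharing one standard error (illustration only).
y = [12.0, -3.0, 5.0, 1.0]
sigma = 4.0
mu = sum(y) / len(y)  # simplification: treat the common mean as known

for tau in (0.5, 4.0, 40.0):
    b = shrinkage_factor(sigma, tau)
    estimates = [round(v, 2) for v in shrunken_estimates(y, sigma, mu, tau)]
    print(f"tau={tau:>4}: B={b:.3f}, estimates={estimates}")
```

A small tau pushes B toward 1, so every group is pulled almost entirely to the common mean. A large tau pushes B toward 0 and leaves the raw estimates nearly untouched. In a fully Bayesian analysis tau is not fixed but averaged over its posterior, which is why the hyperprior on tau matters.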

Key concepts

hyperprior
variance component
half-Cauchy prior
inverse-gamma prior
shrinkage
James-Stein estimator
degenerate posterior

Key theories

Variance-component priors: The hyperprior on the group-level standard deviation strongly influences inference when groups are few; priors from the folded-noncentral-t family, of which the half-Cauchy is a special case, avoid the pathologies of conventional inverse-gamma choices.
Shrinkage as risk reduction: Shrinking many related estimates toward a common center lowers total mean squared error, the same principle by which the James-Stein estimator dominates the unshrunk sample means when three or more means are estimated together.
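The risk-reduction claim can be checked with a short simulation. The sketch below uses the positive-part James-Stein estimator shrinking toward zero with unit sampling variance. This is a simpler variant than shrinking toward an estimated common center. The true means, replication count, and seed are hypothetical choices made for illustration.

```python
import random


def james_stein(y, sigma2=1.0):
    """Positive-part James-Stein estimate shrinking toward zero (needs len(y) >= 3)."""
    p = len(y)
    norm2 = sum(v * v for v in y)
    factor = max(0.0, 1.0 - (p - 2) * sigma2 / norm2)
    return [factor * v for v in y]


def squared_error(estimate, theta):
    """Total squared error of an estimate vector against the true means."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def compare_risk(theta, n_rep=5000, seed=0):
    """Monte Carlo average loss of the raw observations and of James-Stein."""
    rng = random.Random(seed)
    raw_total = 0.0
    js_total = 0.0
    for _ in range(n_rep):
        # One noisy observation per mean, with unit variance.
        y = [t + rng.gauss(0.0, 1.0) for t in theta]
        raw_total += squared_error(y, theta)
        js_total += squared_error(james_stein(y), theta)
    return raw_total / n_rep, js_total / n_rep


# Hypothetical true means: ten values evenly spaced on [-1, 1].
theta = [-1.0 + 2.0 * i / 9 for i in range(10)]
raw_risk, js_risk = compare_risk(theta)
print(f"raw estimates: {raw_risk:.2f}, James-Stein: {js_risk:.2f}")
```

The average loss of the raw observations should sit near the number of means, while the James-Stein average should be noticeably lower. The gain shrinks as the true means move further from the shrinkage target.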

Clinical relevance

Sensible hyperpriors prevent overconfident or unstable estimates of between-group variation in meta-analysis and multi-site studies, where the number of groups is often small and the variance is hard to estimate.

History

Shrinkage estimation grew from Stein's 1956 result and the empirical Bayes work of Efron and Morris in the 1970s. Gelman's 2006 analysis of variance-parameter priors clarified how hyperprior choice shapes shrinkage in fully Bayesian hierarchical models.

Debates

Which prior for the group-level variance? Conventional inverse-gamma priors can be unintentionally informative near zero, so there is ongoing discussion about half-Cauchy, half-normal, and other weakly informative scale priors.
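One way to see the concern is to compare how strongly each prior favors small values of the group-level standard deviation tau. The sketch below takes the density on tau implied by an inverse-gamma(eps, eps) prior on the variance tau^2 and sets it beside a half-Cauchy prior placed directly on tau. It then reports the ratio of prior density at tau = 0.1 to that at tau = 1. The values eps = 0.001 and half-Cauchy scale 5 are hypothetical choices for illustration, not recommendations.

```python
import math


def log_density_tau_inverse_gamma(tau, eps):
    """Log density on tau implied by tau**2 ~ inverse-gamma(eps, eps)."""
    variance = tau * tau
    log_ig = (eps * math.log(eps) - math.lgamma(eps)
              - (eps + 1.0) * math.log(variance) - eps / variance)
    # Change of variables from the variance to tau: d(tau**2)/d(tau) = 2 * tau.
    return log_ig + math.log(2.0 * tau)


def log_density_tau_half_cauchy(tau, scale):
    """Log density of a half-Cauchy prior on tau with the given scale."""
    return math.log(2.0 / (math.pi * scale)) - math.log1p((tau / scale) ** 2)


def density_ratio(log_density, small, large, *params):
    """Prior density at `small` divided by prior density at `large`."""
    return math.exp(log_density(small, *params) - log_density(large, *params))


eps = 0.001   # hypothetical "default" inverse-gamma setting
scale = 5.0   # hypothetical half-Cauchy scale

ig_ratio = density_ratio(log_density_tau_inverse_gamma, 0.1, 1.0, eps)
hc_ratio = density_ratio(log_density_tau_half_cauchy, 0.1, 1.0, scale)
print(f"inverse-gamma({eps}, {eps}) on variance: p(0.1)/p(1) = {ig_ratio:.2f}")
print(f"half-Cauchy(scale={scale}) on tau: p(0.1)/p(1) = {hc_ratio:.2f}")
```

Under these settings the inverse-gamma prior weights tau = 0.1 several times more heavily than tau = 1, whereas the half-Cauchy is nearly flat over the same range. That flatness is the sense in which it is only weakly informative near zero.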

Key figures

Andrew Gelman
Bradley Efron
Carl Morris
Charles Stein

Seminal works

gelman2006
efron1975

Frequently asked questions

Why not just use a flat prior on the group-level variance? A flat prior can leave the posterior improper or overly diffuse when groups are few, and a default inverse-gamma(ε, ε) prior can place excessive weight near zero and make inference sensitive to the choice of ε; either way the posterior may collapse or become unstable. Weakly informative scale priors such as the half-Cauchy behave more reliably.
