Variational Inference

Variational inference turns posterior approximation into optimization, fitting a simpler distribution to the posterior by maximizing a lower bound on the marginal likelihood.

Definition

Variational inference approximates an intractable posterior by selecting, from a tractable family of distributions, the member that minimizes the Kullback-Leibler (KL) divergence to the posterior or, equivalently, maximizes the evidence lower bound (ELBO) on the log marginal likelihood.
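
The equivalence stated in the definition follows from a short identity. As a sketch, assume observed data x, latent variables z, and an approximating density q(z) from the chosen family (this notation is introduced here for illustration and does not appear in the text above):

```latex
\begin{align*}
\mathrm{KL}\bigl(q(z)\,\|\,p(z \mid x)\bigr)
  &= \mathbb{E}_{q}\bigl[\log q(z)\bigr] - \mathbb{E}_{q}\bigl[\log p(z \mid x)\bigr] \\
  &= \mathbb{E}_{q}\bigl[\log q(z)\bigr] - \mathbb{E}_{q}\bigl[\log p(x, z)\bigr] + \log p(x), \\
\mathrm{ELBO}(q)
  &= \mathbb{E}_{q}\bigl[\log p(x, z)\bigr] - \mathbb{E}_{q}\bigl[\log q(z)\bigr], \\
\log p(x)
  &= \mathrm{ELBO}(q) + \mathrm{KL}\bigl(q(z)\,\|\,p(z \mid x)\bigr).
\end{align*}
```

Because log p(x) does not depend on q and the KL term is non-negative, the ELBO never exceeds log p(x), and raising it by some amount lowers the KL divergence by the same amount.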

Scope

This topic covers the variational objective (the evidence lower bound), the mean-field family and its factorization assumptions, coordinate-ascent and stochastic gradient algorithms, and the trade-offs between speed and the systematic biases of approximate inference.

Core questions

How is posterior approximation framed as an optimization problem?
What is the evidence lower bound and how is it related to the KL divergence?
What does the mean-field assumption sacrifice in exchange for tractability?
How do stochastic and black-box methods scale variational inference to large data?
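
The second question above can be checked numerically on a toy problem. The sketch below assumes a latent variable with three states and made-up joint probabilities p(x, z) for one fixed observation x; the helper names elbo and kl are hypothetical, not taken from any library.

```python
import numpy as np

def elbo(q, joint):
    """ELBO(q) = E_q[log p(x, z)] - E_q[log q(z)] for a discrete latent z."""
    return float(np.sum(q * (np.log(joint) - np.log(q))))

def kl(q, p):
    """KL(q || p) for two discrete distributions with full support."""
    return float(np.sum(q * (np.log(q) - np.log(p))))

# Hypothetical joint probabilities p(x, z) for z = 0, 1, 2 at the observed x.
joint = np.array([0.10, 0.25, 0.05])
log_evidence = float(np.log(joint.sum()))   # log p(x)
posterior = joint / joint.sum()             # exact p(z | x)

# An arbitrary approximating distribution q(z).
q = np.array([0.2, 0.5, 0.3])

# The gap between log p(x) and the ELBO is exactly KL(q || posterior).
gap = log_evidence - elbo(q, joint)
print(gap, kl(q, posterior))
```

Setting q equal to the exact posterior closes the gap, which is why maximizing the ELBO over a family that contains the posterior recovers it.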

Key concepts

evidence lower bound
Kullback-Leibler divergence
mean-field family
coordinate-ascent variational inference
stochastic variational inference
black-box variational inference
variance underestimation

Key theories

Evidence lower bound: Maximizing the ELBO is equivalent to minimizing the KL divergence from the approximation to the posterior, recasting inference as a tractable optimization over a chosen family.
Mean-field approximation: Assuming the approximate posterior factorizes across parameter blocks yields closed-form coordinate-ascent updates but tends to underestimate posterior variance and ignore dependencies.
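
The variance underestimation attributed to the mean-field approximation can be seen in a small example. The sketch below assumes the "posterior" is a correlated two-dimensional Gaussian with made-up mean and correlation, and relies on the standard result that, for a Gaussian target with precision matrix Lam, each optimal mean-field factor is Gaussian with variance 1 / Lam[i, i]. All variable names are illustrative.

```python
import numpy as np

# Hypothetical target: a bivariate Gaussian posterior with correlation rho.
mu = np.array([1.0, -1.0])
rho = 0.9
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Lam = np.linalg.inv(Sigma)  # precision matrix

# Factorized approximation q(z1) q(z2): each optimal factor is Gaussian
# with variance 1 / Lam[i, i], which ignores the correlation entirely.
mf_var = 1.0 / np.diag(Lam)

# Coordinate-ascent updates for the factor means: each mean is set to the
# conditional mean of its coordinate given the other factor's current mean.
m = np.zeros(2)
for _ in range(200):
    m[0] = mu[0] - Lam[0, 1] / Lam[0, 0] * (m[1] - mu[1])
    m[1] = mu[1] - Lam[1, 0] / Lam[1, 1] * (m[0] - mu[0])

print("true marginal variances:", np.diag(Sigma))
print("mean-field variances:   ", mf_var)
print("mean-field means:       ", m)
```

In this setup the means converge to the true means, but each mean-field variance equals 1 - rho**2, so the stronger the correlation, the more the approximation understates the marginal uncertainty.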

Practical relevance

Variational inference scales Bayesian methods to large datasets and complex models in text analysis, genomics, and deep learning, where the cost of full Markov chain Monte Carlo (MCMC) would be prohibitive and a fast approximate posterior suffices.
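
One way this scaling can work is stochastic variational inference, which updates the approximation from small random minibatches instead of the full dataset. The sketch below is a toy illustration under stated assumptions: a hypothetical conjugate model (Gaussian prior on a mean, unit-variance Gaussian observations), simulated data, and an arbitrarily chosen step-size schedule. Because the Gaussian family contains the exact posterior here, the example shows only the noisy minibatch updates, not the approximation bias found in harder models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: theta ~ N(0, 10^2), x_i | theta ~ N(theta, 1).
N, B = 10_000, 100                      # dataset size, minibatch size
prior_mean, prior_var = 0.0, 100.0
x = rng.normal(loc=2.0, scale=1.0, size=N)   # simulated data

# Natural parameters of a Gaussian q(theta): (mean * precision, precision).
prior_nat = np.array([prior_mean / prior_var, 1.0 / prior_var])
lam = prior_nat.copy()

for t in range(2_000):
    batch = rng.choice(x, size=B, replace=False)
    # Noisy estimate of the optimal parameters, computed as if the
    # minibatch were replicated N / B times.
    lam_hat = prior_nat + np.array([(N / B) * batch.sum(), N])
    rho = (t + 1.0) ** -0.7             # decaying step size (assumed schedule)
    lam = (1.0 - rho) * lam + rho * lam_hat

svi_mean, svi_var = lam[0] / lam[1], 1.0 / lam[1]

# Exact conjugate posterior from the full dataset, for comparison.
exact_var = 1.0 / (1.0 / prior_var + N)
exact_mean = exact_var * (prior_mean / prior_var + x.sum())

print(svi_mean, exact_mean)
print(svi_var, exact_var)
```

Each iteration touches only B of the N observations, so the cost per step does not grow with the dataset; the decaying step size averages out the minibatch noise over time.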

History

Variational methods entered machine learning through mean-field approximations for graphical models in the late 1990s. Stochastic and automatic variational inference in the 2010s, surveyed by Blei and colleagues in 2017, brought scalable approximate Bayesian inference to mainstream statistics and probabilistic programming.

Debates

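Bias of approximate posteriors: Variational inference is fast, but the KL divergence it minimizes, taken from the approximation to the posterior, penalizes placing mass where the posterior has little and so tends to understate uncertainty, especially under mean-field factorization. How reliable its approximate posteriors are relative to asymptotically exact MCMC is therefore debated.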

Key figures

Michael Jordan
Zoubin Ghahramani
David Blei
Tommi Jaakkola

Seminal works

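Blei, D. M., Kucukelbir, A., and McAuliffe, J. D. (2017). Variational Inference: A Review for Statisticians. Journal of the American Statistical Association.
Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., and Saul, L. K. (1999). An Introduction to Variational Methods for Graphical Models. Machine Learning.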

Frequently asked questions

When should I use variational inference instead of MCMC?
Variational inference is attractive when datasets or models are too large for MCMC to be feasible and a fast, approximate posterior is acceptable; MCMC remains preferable when accurate uncertainty quantification is essential, because variational methods tend to underestimate posterior variance.