Heterogeneity in Meta-Analysis

Heterogeneity in meta-analysis is the variation in true effects across the studies being pooled, beyond what sampling error alone would produce. Measuring and interpreting it tells the analyst whether the studies are estimating essentially the same thing or genuinely different things, which shapes both the model used and the confidence placed in the summary.

Definition

Heterogeneity is the degree to which the true effects estimated by individual studies in a meta-analysis differ from one another, quantified by statistics such as Cochran's Q, I-squared (the proportion of total variation due to between-study differences rather than chance), and tau-squared (the estimated between-study variance).
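For reference, the three statistics named above are commonly written as follows. This is a sketch in notation introduced here, not taken from this entry: k studies, effect estimates \hat\theta_i, within-study variances v_i, and inverse-variance weights w_i = 1/v_i. The tau-squared expression shown is the DerSimonian-Laird moment estimator, which is only one of several estimators in use and is assumed here for illustration.

```latex
\begin{align}
  \hat\theta_{F} &= \frac{\sum_{i=1}^{k} w_i \hat\theta_i}{\sum_{i=1}^{k} w_i},
    \qquad w_i = \frac{1}{v_i} \\
  Q &= \sum_{i=1}^{k} w_i \left(\hat\theta_i - \hat\theta_{F}\right)^2 \\
  I^2 &= \max\!\left(0,\; \frac{Q - (k-1)}{Q}\right) \times 100\% \\
  \hat\tau^2_{DL} &= \max\!\left(0,\; \frac{Q - (k-1)}{C}\right),
    \qquad C = \sum_{i=1}^{k} w_i - \frac{\sum_{i=1}^{k} w_i^2}{\sum_{i=1}^{k} w_i}
\end{align}
```

Under the hypothesis of a single common effect, Q is approximately chi-squared distributed with k - 1 degrees of freedom, which is why k - 1 appears as the baseline in both I-squared and tau-squared.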

Scope

This entry covers the statistical assessment of between-study heterogeneity: the Cochran Q test, the I-squared statistic, the between-study variance tau-squared, and the known limitations of these measures. It treats heterogeneity as a methodological topic within evidence synthesis and offers reference description, not clinical advice.

Core questions

Do the included studies estimate one common effect or a range of different effects?
How much of the observed variation is real between-study difference versus sampling noise?
How should I-squared and tau-squared be interpreted, and where do they mislead?
When does heterogeneity make a single pooled estimate inappropriate?

Key concepts

Cochran's Q test
I-squared statistic
Tau-squared (between-study variance)
Clinical versus statistical heterogeneity
Prediction interval
Subgroup analysis as a response to heterogeneity

Mechanisms

The total variation among study estimates is partitioned into within-study sampling error and genuine between-study variation. Cochran's Q compares the observed dispersion against what sampling error alone predicts, but the test has low power when few studies are pooled. Higgins and Thompson therefore proposed I-squared, the percentage of total variation attributable to between-study heterogeneity rather than chance, whose definition does not inherently depend on the number of studies. Tau-squared estimates the variance of the underlying distribution of true effects and feeds directly into random-effects weighting and prediction intervals. Two caveats follow. Rücker and colleagues show that I-squared depends on the precision of the included studies, so it can be large simply because the studies are precise. Von Hippel shows that it is unstable and can be biased in small meta-analyses. These statistics should therefore be read alongside the absolute spread of effects rather than against fixed thresholds.
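The partition described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `heterogeneity` and the `HeterogeneitySummary` container are hypothetical, and tau-squared is computed with the DerSimonian-Laird moment estimator, which is an assumption made here (other estimators exist and can give different values).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HeterogeneitySummary:
    q: float            # Cochran's Q statistic
    df: int             # degrees of freedom, k - 1
    i_squared: float    # percentage between 0 and 100
    tau_squared: float  # DerSimonian-Laird estimate (assumed estimator)


def heterogeneity(effects: Sequence[float],
                  variances: Sequence[float]) -> HeterogeneitySummary:
    """Compute Q, I-squared and tau-squared from study effects and variances."""
    if len(effects) != len(variances):
        raise ValueError("effects and variances must have the same length")
    if len(effects) < 2:
        raise ValueError("at least two studies are required")
    if any(v <= 0 for v in variances):
        raise ValueError("within-study variances must be positive")

    # Inverse-variance weights and the fixed-effect pooled estimate.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Dispersion beyond what sampling error alone predicts, truncated at zero.
    excess = max(0.0, q - df)
    i_squared = 100.0 * excess / q if q > 0 else 0.0

    # DerSimonian-Laird scaling constant; positive whenever k >= 2.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau_squared = excess / c

    return HeterogeneitySummary(q=q, df=df, i_squared=i_squared,
                                tau_squared=tau_squared)
```

As a made-up example, three studies with effects 0.1, 0.3 and 0.5 and a within-study variance of 0.01 each give Q = 8 on 2 degrees of freedom, an I-squared of 75% and a tau-squared of 0.03, up to floating-point rounding.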

Clinical relevance

Whether and how a body of trials is summarised depends heavily on its heterogeneity, so appraising heterogeneity statistics is part of judging how much weight a pooled result deserves in guidelines and health technology assessment. This entry describes how heterogeneity is measured and is not a basis for individual clinical decisions.

Evidence & guidelines

The Cochrane Handbook describes expected practice for assessing and reporting heterogeneity, including the use of I-squared with cautionary interpretation and the role of prediction intervals, consistent with the methodological literature summarised here.

History

Cochran's Q test for combining experiments dates from mid-twentieth-century statistics, but it proved underpowered for the small numbers of studies common in clinical meta-analysis. Higgins and Thompson's 2002 paper, followed by the widely cited 2003 BMJ exposition, introduced I-squared as an interpretable measure that does not inherently depend on the number of studies. A corrective literature (Rücker et al., 2008; von Hippel, 2015) then clarified its dependence on study precision and its instability in small syntheses.

Debates

How much should I-squared be relied on to judge heterogeneity? I-squared depends on the precision of the included studies and can be unstable when few studies are pooled, so commentators warn against fixed cut-offs and recommend reading it together with tau-squared and the absolute spread of effects.
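The dependence on precision can be made concrete with a small sketch. It relies on a simplifying assumption that is not stated in this entry: if every study has the same within-study variance, I-squared is approximately tau-squared divided by the sum of tau-squared and that variance. The function name `approx_i_squared` is hypothetical.

```python
def approx_i_squared(tau_squared: float, within_variance: float) -> float:
    """Approximate I-squared (in percent) when all studies share one
    within-study variance. Simplifying assumption for illustration only."""
    if tau_squared < 0 or within_variance <= 0:
        raise ValueError("tau_squared must be >= 0 and within_variance > 0")
    return 100.0 * tau_squared / (tau_squared + within_variance)


# Same between-study variance throughout; only study precision changes.
TAU_SQUARED = 0.01
for within_variance in (0.04, 0.01, 0.0025):
    value = approx_i_squared(TAU_SQUARED, within_variance)
    print(f"within-study variance {within_variance}: I-squared ~ {value:.0f}%")
```

With tau-squared held at 0.01, the approximation moves from about 20% to 50% to 80% as the within-study variance shrinks from 0.04 to 0.01 to 0.0025. The true effects are exactly as spread out in all three cases, which is why a fixed cut-off on I-squared can mislead.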

Key figures

Julian Higgins
Simon Thompson
Gerta Rücker
Paul von Hippel
William Cochran

Seminal works

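Higgins JPT, Thompson SG. Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002;21:1539-1558.
Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557-560.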

Frequently asked questions

What does an I-squared of 75% mean? It indicates that about three-quarters of the total variation among study estimates reflects genuine between-study differences rather than sampling error; but because I-squared depends on study precision, it should be interpreted alongside the actual spread of effects, not against a fixed label.
Is high heterogeneity a reason not to pool studies? Not automatically. High heterogeneity signals that studies differ and prompts investigation of why, but whether to pool, to use a random-effects model, or to refrain depends on whether the differences are explicable and the studies clinically comparable.
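Where a random-effects model is chosen, the between-study variance enters the weights directly. The sketch below assumes standard inverse-variance random-effects pooling with a tau-squared value supplied by the caller. The function name `pool` is hypothetical, and setting `tau_squared` to zero reduces it to a fixed-effect (common-effect) analysis.

```python
import math
from typing import Sequence, Tuple


def pool(effects: Sequence[float],
         variances: Sequence[float],
         tau_squared: float = 0.0) -> Tuple[float, float]:
    """Return (pooled estimate, standard error) using weights
    1 / (within-study variance + tau_squared)."""
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal length")
    if tau_squared < 0 or any(v <= 0 for v in variances):
        raise ValueError("tau_squared must be >= 0 and variances > 0")

    # Adding tau_squared to every variance flattens the weights, so
    # large, precise studies dominate less than under a fixed-effect model.
    weights = [1.0 / (v + tau_squared) for v in variances]
    sum_w = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum_w
    standard_error = math.sqrt(1.0 / sum_w)
    return estimate, standard_error
```

For two made-up studies with effects 0.1 and 0.5 and variances 0.01 and 0.04, the fixed-effect estimate is 0.18. With a tau-squared of 0.04 the estimate moves to roughly 0.254 and its standard error grows, reflecting the extra uncertainty from between-study variation.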