GRADE (Grading of Recommendations, Assessment, Development, and Evaluation)

GRADE is a structured, transparent system for rating the certainty of a body of evidence and the strength of the recommendations that follow from it. It does not rank evidence by study design alone: design sets the starting point, after which certainty is rated up or down for each important outcome. It also keeps that certainty separate from the strength of the recommendation, so that both the guidance and the confidence behind it are explicit.

Definition

GRADE is a framework that rates the certainty (quality) of evidence for each important outcome as high, moderate, low, or very low, and separately rates the strength of a recommendation as strong or conditional, based on the balance of benefits and harms, the certainty of evidence, values and preferences, and resource use.

Scope

The entry covers GRADE's two core judgements—certainty of evidence and strength of recommendation—the factors that lower or raise certainty, the four certainty categories, and the evidence-profile and summary-of-findings tables used to present them. It is a methodological reference on a grading framework, not clinical guidance.

Key concepts

Certainty (quality) of evidence: high, moderate, low, very low
Five factors that lower certainty: risk of bias, inconsistency, indirectness, imprecision, publication bias
Three factors that can raise certainty in observational evidence: large effect, dose-response, plausible confounding working against the effect
Strength of recommendation: strong vs. conditional
Outcome-specific (not study-specific) rating
Evidence profile and summary-of-findings table
Separation of certainty from recommendation strength

Mechanisms

GRADE starts each outcome at a baseline determined by study design: randomised trials begin as high certainty, observational studies as low. It then rates down for risk of bias, inconsistency across studies, indirectness, imprecision, and publication bias, and can rate observational evidence up for a large effect, a dose-response gradient, or confounding likely to have worked against the observed effect. The result is a certainty rating per important outcome, summarised in an evidence profile. Recommendation strength is a separate judgement weighing the balance of desirable and undesirable effects, the certainty of evidence, patients' values and preferences, and resource use; high certainty does not automatically yield a strong recommendation, nor low certainty a conditional one.
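The rating logic described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not an official GRADE tool: in practice each domain is a structured judgement rather than arithmetic. The names `OutcomeAssessment` and `rate_certainty` are hypothetical, and the integer scoring (0, 1, or 2 levels per domain) is an illustrative convention. Following the text above, the sketch applies rating up only to observational evidence.

```python
from dataclasses import dataclass

# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


@dataclass
class OutcomeAssessment:
    """Hypothetical record of a panel's judgements for ONE outcome.

    Each field holds the number of levels to move (assumed 0, 1, or 2);
    the values come from human judgement, not from this code.
    """
    randomised: bool
    # Five domains that can lower certainty.
    risk_of_bias: int = 0
    inconsistency: int = 0
    indirectness: int = 0
    imprecision: int = 0
    publication_bias: int = 0
    # Three domains that can raise certainty of observational evidence.
    large_effect: int = 0
    dose_response: int = 0
    opposing_confounding: int = 0


def rate_certainty(a: OutcomeAssessment) -> str:
    """Return the certainty level for one outcome (illustrative only)."""
    # Baseline by design: randomised trials start high, observational low.
    start = LEVELS.index("high") if a.randomised else LEVELS.index("low")

    down = (a.risk_of_bias + a.inconsistency + a.indirectness
            + a.imprecision + a.publication_bias)

    # Assumption of this sketch: rating up applies to observational evidence only.
    up = 0 if a.randomised else (
        a.large_effect + a.dose_response + a.opposing_confounding)

    # Clamp to the four-level scale.
    index = max(0, min(len(LEVELS) - 1, start - down + up))
    return LEVELS[index]


if __name__ == "__main__":
    # Randomised evidence with serious risk of bias and serious imprecision.
    trial = OutcomeAssessment(randomised=True, risk_of_bias=1, imprecision=1)
    print(rate_certainty(trial))

    # Observational evidence with a large effect.
    cohort = OutcomeAssessment(randomised=False, large_effect=1)
    print(rate_certainty(cohort))
```

The function returns a certainty level for a single outcome only. Recommendation strength is deliberately left out, because it is a separate judgement that also weighs benefits and harms, values and preferences, and resource use.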

Clinical relevance

Guideline panels and health technology assessment bodies use GRADE to communicate, in a standard vocabulary, how confident readers should be in each conclusion and how strongly an action is recommended. A GRADE rating explains why a recommendation is labelled strong or conditional and what certainty supports it. The framework describes how evidence and recommendations are graded; it is not a source of individualised treatment advice.

Evidence & guidelines

GRADE was introduced by the GRADE Working Group (Atkins et al., 2004) and was presented as an emerging consensus in Guyatt et al. (2008). The Journal of Clinical Epidemiology GRADE guidelines series then operationalised it, with articles covering the introduction and evidence profiles (Guyatt et al., 2011), question framing and outcome selection (Guyatt et al., 2011), and rating the quality of evidence (Balshem et al., 2011). It has been adopted by many guideline developers and systematic-review organisations worldwide.

History

GRADE arose in the early 2000s from dissatisfaction with the many incompatible grading systems then in use. The GRADE Working Group, an international collaboration, published its rationale in 2004 and an emerging consensus statement in 2008. From 2011 the group issued a detailed guidelines series in the Journal of Clinical Epidemiology that codified evidence profiles, the factors that move certainty up or down, and the logic linking certainty to recommendation strength.

Debates

Does separating certainty from recommendation strength help or confuse users?: GRADE deliberately decouples how certain the evidence is from how strongly an action is recommended, since low-certainty evidence can still justify a strong recommendation and vice versa; some find this clarifying, others find the distinction hard to apply consistently.

Key figures

Gordon Guyatt
Andrew Oxman
Holger Schünemann
Howard Balshem
David Atkins

Seminal works

Atkins et al. (GRADE Working Group), 2004
Guyatt et al., 2008
Balshem et al., 2011

Frequently asked questions

What do GRADE's four certainty levels mean?: High, moderate, low, and very low describe how confident one can be that the estimated effect is close to the true effect for a given outcome; the rating reflects risk of bias, consistency, directness, precision, and publication bias across the body of evidence.
Can low-certainty evidence support a strong recommendation?: Yes. In GRADE, recommendation strength is a separate judgement that also weighs the balance of benefits and harms, values, and resources, so a strong recommendation can rest on low-certainty evidence and a conditional one on high-certainty evidence.