Evidence Evaluation and Critical Appraisal

Evidence evaluation and critical appraisal is the disciplined judgement of whether a study or body of evidence is valid, what its results mean, and whether they apply to a given question. It is the core analytic skill of evidence-based medicine, separating the reliability of evidence from the loudness of its claims.

Definition

Critical appraisal is the systematic process of examining research to judge its internal validity (freedom from bias), the size and precision of its results, and its external validity (applicability), so the trustworthiness of the evidence can be established before it is used.

Scope

This topic covers the hierarchy of evidence, the structured assessment of risk of bias in individual studies, the grading of the certainty of a body of evidence, and the judgement of applicability. It is a methodological and reference topic about how evidence is judged, not a source of treatment instructions.

Core questions

Are the study's design and conduct free of important bias?
What is the size and precision of the reported effect?
How certain is the overall body of evidence?
Do the results apply to the patients or question at hand?
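The second question, on the size and precision of an effect, can be made concrete with a small calculation. The sketch below computes a risk ratio and an approximate 95% confidence interval from a two-arm trial using the standard log-normal approximation. The function name `risk_ratio_ci` and the trial counts are hypothetical and chosen only for illustration.

```python
import math


def risk_ratio_ci(events_tx, total_tx, events_ctl, total_ctl, z=1.96):
    """Return (risk ratio, lower, upper) using the log-normal approximation.

    Hypothetical helper for illustration; z=1.96 gives an approximate 95% interval.
    """
    if min(events_tx, events_ctl) <= 0:
        raise ValueError("this sketch needs at least one event in each arm")
    # Effect size: risk in the treatment arm divided by risk in the control arm.
    rr = (events_tx / total_tx) / (events_ctl / total_ctl)
    # Precision: standard error of log(RR) from the four cell counts.
    se_log_rr = math.sqrt(
        1 / events_tx - 1 / total_tx + 1 / events_ctl - 1 / total_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Made-up counts: 15/100 events with treatment, 30/100 with control.
rr, lower, upper = risk_ratio_ci(15, 100, 30, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
# prints: RR = 0.50 (95% CI 0.29 to 0.87)
```

The point estimate gives the size of the effect and the width of the interval gives its precision. A wide interval that approaches or crosses 1 would signal an imprecise result, even if the point estimate looks large.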

Key concepts

Internal validity and risk of bias
External validity and applicability
Hierarchy of evidence
Certainty (quality) of evidence
Effect size and precision
Structured appraisal tools (RoB 2, AMSTAR 2)

Mechanisms

Appraisal proceeds from the individual study to the body of evidence. For a randomised trial, RoB 2 examines five domains where bias can enter: the randomisation process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result. For a systematic review, AMSTAR 2 assesses methodological quality, that is, how rigorously the review itself was planned and conducted. Across studies, the GRADE framework rates the certainty of a body of evidence for each outcome as high, moderate, low, or very low. The rating is lowered for risk of bias, inconsistency, indirectness, imprecision, and publication bias, and may be raised for features such as a large effect. This certainty rating then informs the move from evidence to recommendation. Underlying all of this is the evidence-based medicine principle that external evidence must be appraised before it is integrated with clinical expertise.
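The GRADE rating logic described above can be sketched as a simple tally. This is a minimal illustration, not the GRADE method itself, which is a structured judgement rather than arithmetic. It assumes the usual GRADE convention that evidence from randomised trials starts at high certainty and observational evidence at low, and that each domain can lower the rating by one or two levels. The function `grade_certainty` and its parameter names are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# The five reasons for lowering certainty named in the text.
DOWNGRADE_DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def grade_certainty(randomised, downgrades, upgrades=0):
    """Return a certainty label for one outcome (illustrative sketch only).

    randomised: True if the body of evidence comes from randomised trials.
    downgrades: dict mapping a domain name to 0, 1 or 2 levels lowered.
    upgrades:   total levels raised, e.g. 1 for a large effect.
    """
    for domain, steps in downgrades.items():
        if domain not in DOWNGRADE_DOMAINS:
            raise ValueError(f"unknown domain: {domain}")
        if steps not in (0, 1, 2):
            raise ValueError("each domain lowers certainty by 0, 1 or 2 levels")
    if upgrades < 0:
        raise ValueError("upgrades cannot be negative")
    # Assumed starting point: high for randomised trials, low for observational.
    start = LEVELS.index("high") if randomised else LEVELS.index("low")
    score = start - sum(downgrades.values()) + upgrades
    # Clamp to the four-level scale.
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


# Randomised trials with serious risk of bias and serious imprecision.
print(grade_certainty(True, {"risk_of_bias": 1, "imprecision": 1}))  # low
# Observational studies showing a large effect and no reasons to downgrade.
print(grade_certainty(False, {}, upgrades=1))  # moderate
```

In practice the rating is made per outcome, and each decision to lower or raise it is justified in writing. The tally only shows how the domains combine.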

Clinical relevance

Critical appraisal determines how much weight a piece of evidence should carry in formulary decisions, guideline development, and the answering of drug information questions. It is a reference skill for weighing the medicines literature and describes how evidence is judged; it does not itself direct individual diagnosis or therapy.

Evidence & guidelines

Several widely adopted instruments standardise appraisal: the Cochrane RoB 2 tool for risk of bias in randomised trials, AMSTAR 2 for the methodological quality of systematic reviews, and the GRADE framework for rating the certainty of a body of evidence and the strength of recommendations. These tools are maintained by their developer groups and updated as methods evolve.
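As an illustration of how such an instrument turns domain-level judgements into an overall one, the sketch below follows the overall-judgement rule in the RoB 2 guidance: low only if every domain is low, high if any domain is high, and otherwise some concerns. The guidance also allows several "some concerns" judgements to be escalated to high when together they substantially lower confidence. Because that step is a reviewer's judgement, it is modelled here as an explicit flag. The function and parameter names are hypothetical.

```python
ROB2_DOMAINS = (
    "randomisation process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)

VALID_JUDGEMENTS = {"low", "some concerns", "high"}


def rob2_overall(judgements, concerns_substantially_lower_confidence=False):
    """Combine five domain judgements into an overall risk-of-bias judgement.

    judgements: dict mapping each RoB 2 domain to 'low', 'some concerns' or 'high'.
    The flag stands in for the reviewer's judgement that multiple
    'some concerns' domains together substantially lower confidence.
    """
    missing = set(ROB2_DOMAINS) - set(judgements)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    values = [judgements[d] for d in ROB2_DOMAINS]
    if not set(values) <= VALID_JUDGEMENTS:
        raise ValueError("judgements must be 'low', 'some concerns' or 'high'")
    if "high" in values:
        return "high"
    if "some concerns" in values:
        if concerns_substantially_lower_confidence and values.count("some concerns") > 1:
            return "high"
        return "some concerns"
    return "low"


# Hypothetical trial result: one domain raises some concerns, the rest are low.
example = {domain: "low" for domain in ROB2_DOMAINS}
example["missing outcome data"] = "some concerns"
print(rob2_overall(example))  # some concerns
```

The overall judgement is driven by the worst domain rather than an average, so a single serious flaw is not diluted by good conduct elsewhere.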

History

Critical appraisal was formalised by the clinical epidemiology and evidence-based medicine movements of the 1980s and 1990s, with Sackett and colleagues articulating its principles. Structured instruments followed: the Cochrane risk-of-bias tools (revised as RoB 2), AMSTAR for systematic-review quality (revised as AMSTAR 2), and the GRADE approach to grading certainty, which together replaced informal judgement with explicit, reproducible criteria.

Key figures

David Sackett
Gordon Guyatt
Jonathan Sterne
Beverley Shea

Seminal works

sackett-1996
guyatt-2008-grade
sterne-2019-rob2
shea-2017-amstar2

Frequently asked questions

What is the difference between internal and external validity?

Internal validity is whether a study's result is free of bias and reflects a true effect in its own sample; external validity is whether that result applies to other patients, settings, or questions.

Does a higher place in the evidence hierarchy guarantee a better answer?

No. Study design sets the potential strength of evidence, but a poorly conducted trial or review can still be biased. This is why structured risk-of-bias appraisal is applied to each study and certainty grading to the body of evidence as a whole, whatever the design.