Reproducible Research

Reproducible research is the practice of conducting and publishing statistical analyses so that others, given the same data and code, can regenerate the reported results exactly.

Definition

Reproducible research is a set of practices ensuring that the computational results of a statistical analysis can be regenerated from the original data and code, by binding together data, analysis code, computing environment and narrative.
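As a rough illustration of what "binding together" can mean in practice, the sketch below records checksums of the data and the code alongside the results they produced. The data set, the `CODE` string, and the helper names (`sha256_of`, `analyze`, `build_manifest`) are all hypothetical stand-ins, and the computing environment and narrative are left out for brevity.

```python
import hashlib
import json
import statistics


def sha256_of(payload: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(payload).hexdigest()


def analyze(raw_csv: str) -> dict:
    """Toy analysis: mean and sample standard deviation of one numeric column."""
    values = [float(line) for line in raw_csv.strip().splitlines()[1:]]
    return {
        "n": len(values),
        "mean": round(statistics.mean(values), 6),
        "sd": round(statistics.stdev(values), 6),
    }


def build_manifest(raw_csv: str, code_text: str) -> dict:
    """Bind data, code, and results into one verifiable record."""
    results = analyze(raw_csv)
    serialized = json.dumps(results, sort_keys=True).encode("utf-8")
    return {
        "data_sha256": sha256_of(raw_csv.encode("utf-8")),
        "code_sha256": sha256_of(code_text.encode("utf-8")),
        "results": results,
        "results_sha256": sha256_of(serialized),
    }


# Hypothetical data and code text, standing in for real files.
DATA = "height_cm\n170.0\n165.5\n180.2\n175.3\n"
CODE = "mean and sd of height_cm"

first = build_manifest(DATA, CODE)
second = build_manifest(DATA, CODE)

print(json.dumps(first, indent=2))
print("regenerated exactly:", first == second)
```

If anyone rerunning the analysis obtains a different `data_sha256` or `results_sha256`, either the inputs or the computation have drifted from what was reported.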

Scope

This topic covers literate programming, which weaves code, results and narrative together; the dynamic documents and notebooks that implement it; version control and environment capture; data and code sharing under principles such as FAIR (findable, accessible, interoperable, reusable); and the distinction between reproducibility and the harder goal of replicability. The emphasis is on computational reproducibility of an analysis.

Core questions

What does it mean for a computational analysis to be reproducible?
How do literate programming and dynamic documents bind code to results?
How do version control and environment capture preserve an analysis?
How do data-sharing principles such as FAIR support reuse and verification?

Key concepts

Literate programming
Dynamic documents
Version control
Environment capture
FAIR data principles
Reproducibility versus replicability

Key theories

Literate programming and dynamic documents: Interleaving analysis code with explanatory text and regenerating figures and tables directly from that code, as in literate programming and modern notebooks, ensures that reported results always match the computations that produced them.
Findable, accessible data and environments: Sharing data and code under principles such as FAIR, together with captured computing environments and version history, lets others locate, run and verify an analysis rather than merely read its conclusions.
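The dynamic-document idea can be sketched with nothing more than a text template: the narrative contains placeholders, and every reported number is computed and substituted at render time. This is only a minimal stand-in for real dynamic-document tools; the measurements, the `REPORT` template, and `render_report` are hypothetical.

```python
import statistics
from string import Template

# Hypothetical measurements; in practice these would be read from the shared data file.
reaction_times_ms = [312, 298, 305, 321, 290, 310]

# Narrative with placeholders instead of hand-typed numbers.
REPORT = Template(
    "We analyzed $n trials. The mean reaction time was $mean ms "
    "(SD $sd ms), with a range of $low to $high ms."
)


def render_report(values: list[float]) -> str:
    """Compute every reported statistic and substitute it into the narrative."""
    return REPORT.substitute(
        n=len(values),
        mean=f"{statistics.mean(values):.1f}",
        sd=f"{statistics.stdev(values):.1f}",
        low=min(values),
        high=max(values),
    )


print(render_report(reaction_times_ms))
```

Because no number in the text is typed by hand, changing the data and rendering again updates the narrative automatically, which is the property that keeps reported results in step with the computation.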

Clinical relevance

Reproducible workflows let collaborators, reviewers and regulators verify statistical results, catch errors, and build on prior work; amid concern over a reproducibility crisis across the sciences, these practices are a practical safeguard for the credibility of data analyses.

History

Knuth's literate programming supplied the underlying idea, and Claerbout pioneered reproducible computational documents in geophysics. Statisticians such as Gentleman then formalized reproducible analysis, and dynamic-document tools and the FAIR principles later made these practices mainstream.

Debates

Reproducibility versus replicability: Regenerating the same results from the same data and code (reproducibility) is distinct from obtaining consistent findings in a new study (replicability); there is ongoing discussion about terminology and about how much each guarantees scientific validity.
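The distinction can be made concrete with a small simulation, offered here as an illustrative sketch only. The population (normal, mean 10, standard deviation 2), the seeds, and the function names are assumptions chosen for the example.

```python
import random
import statistics


def draw_sample(seed: int, n: int = 200) -> list[float]:
    """Simulate collecting n measurements from a population with true mean 10."""
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def estimate_mean(sample: list[float]) -> float:
    """The 'analysis': a plain sample mean."""
    return statistics.fmean(sample)


original_data = draw_sample(seed=1)

# Reproduction: same data, same code, so the estimate matches exactly.
original_estimate = estimate_mean(original_data)
reproduced_estimate = estimate_mean(original_data)

# Replication: a new study draws fresh data, so the estimate agrees only approximately.
replication_estimate = estimate_mean(draw_sample(seed=2))

print("reproduced exactly:", reproduced_estimate == original_estimate)
print("replication differs by:", abs(replication_estimate - original_estimate))
```

Reproduction is an equality check on a fixed computation, whereas replication is a statistical question about whether two independent estimates are consistent, which is one reason it is the harder standard.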

Key figures

Donald Knuth
Robert Gentleman
Duncan Temple Lang
Jon Claerbout

Seminal works

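Knuth, D. E. (1984). Literate programming. The Computer Journal, 27(2), 97-111.
Gentleman, R., & Temple Lang, D. (2007). Statistical analyses and reproducible research. Journal of Computational and Graphical Statistics, 16(1), 1-23.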

Frequently asked questions

Is reproducibility the same as getting the same scientific conclusion in a new experiment? No. Reproducibility means regenerating the same results from the same data and code. Obtaining a consistent finding in a fresh study with new data is replicability, a separate and generally harder standard.
What tools support reproducible research? Three kinds of tools work together: dynamic-document systems and notebooks that run code to produce figures and tables, version control that tracks changes, and environment-capture tools that record software versions. Used in combination, they let others rerun an analysis and obtain the same results.
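Environment capture, the last of the tools mentioned above, can be approximated with the Python standard library alone. The sketch below is a minimal illustration rather than a substitute for dedicated environment-management tools; `capture_environment` and the shape of the snapshot are hypothetical.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment() -> dict:
    """Record interpreter, operating system, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "packages": dict(sorted(packages.items())),
    }


snapshot = capture_environment()
print("Python", snapshot["python"], "on", snapshot["platform"])
print(len(snapshot["packages"]), "installed packages recorded")

# The snapshot serializes to JSON, so it can be stored next to the analysis code.
snapshot_json = json.dumps(snapshot, indent=2)
```

Committing such a snapshot to version control alongside the analysis gives later readers a record of which software versions produced the reported results.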