ScholarGate

Reproducible Research

Reproducible research is the practice of conducting and publishing statistical analyses so that others, given the same data and code, can regenerate the reported results exactly.

Definition

Reproducible research is a set of practices ensuring that the computational results of a statistical analysis can be regenerated from the original data and code, by binding together data, analysis code, computing environment and narrative.

Scope

This topic covers literate programming that weaves code, results and narrative together, the dynamic documents and notebooks that implement it, version control and environment capture, data and code sharing under principles such as FAIR, and the distinction between reproducibility and the harder goal of replicability. The emphasis is on computational reproducibility of an analysis.

Core questions

  • What does it mean for a computational analysis to be reproducible?
  • How do literate programming and dynamic documents bind code to results?
  • How do version control and environment capture preserve an analysis?
  • How do data-sharing principles such as FAIR support reuse and verification?

Key concepts

  • Literate programming
  • Dynamic documents
  • Version control
  • Environment capture
  • FAIR data principles
  • Reproducibility versus replicability

Key theories

Literate programming and dynamic documents
Interleaving analysis code with explanatory text and regenerating figures and tables directly from that code, as in literate programming and modern notebooks, ensures that reported results always match the computations that produced them.
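The idea can be illustrated with a minimal sketch of a dynamic document, assuming a tiny made-up dataset and a one-sentence report. The names `measurements` and `render_report` are hypothetical and chosen only for illustration; real tools such as notebooks do far more, but the principle is the same: every number in the text is computed by the code at render time, never typed by hand.

```python
import statistics
from string import Template

# Hypothetical measurements, standing in for a real dataset.
measurements = [4.1, 3.9, 4.4, 4.0, 4.6]


def render_report(data):
    """Compute summary statistics and weave them into the narrative text."""
    results = {
        "n": len(data),
        "mean": f"{statistics.mean(data):.2f}",
        "sd": f"{statistics.stdev(data):.2f}",
    }
    # The narrative holds placeholders only; values come from `results`.
    narrative = Template(
        "We analyzed $n observations; the mean was $mean (SD $sd)."
    )
    return narrative.substitute(results)


report = render_report(measurements)
print(report)
```

If the data change, rerunning the script regenerates the sentence, so the reported values cannot drift away from the computation that produced them.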
Findable, accessible data and environments
Sharing data and code under principles such as FAIR, together with captured computing environments and version history, lets others locate, run and verify an analysis rather than merely read its conclusions.
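One possible way to capture an environment alongside shared data is sketched below, using only the Python standard library. The function names, the manifest layout, the inline sample data and the package list (`numpy`, `pandas`) are assumptions made for illustration, not a standard format; dedicated environment-capture tools record considerably more detail.

```python
import hashlib
import json
import platform
from importlib import metadata


def sha256_of_bytes(payload: bytes) -> str:
    """Return a checksum so others can confirm they hold the same data."""
    return hashlib.sha256(payload).hexdigest()


def capture_environment(packages):
    """Record interpreter, platform and installed package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": versions,
    }


def build_manifest(data: bytes, packages):
    """Bundle the data checksum with the environment description."""
    return {
        "data_sha256": sha256_of_bytes(data),
        "environment": capture_environment(packages),
    }


# Hypothetical inline data; a real analysis would read the shared file.
manifest = build_manifest(b"id,value\n1,4.1\n2,3.9\n", ["numpy", "pandas"])
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Storing such a manifest under version control next to the analysis code gives a later reader a concrete record of which data and which software versions produced the results.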

Clinical relevance

Reproducible workflows let collaborators, reviewers and regulators verify statistical results, catch errors, and build on prior work; amid concern over a reproducibility crisis across the sciences, these practices are a practical safeguard for the credibility of data analyses.

History

Jon Claerbout pioneered reproducible computational documents in geophysics, building on the idea that Donald Knuth's literate programming supplied: code and explanation belong in a single document. Statisticians such as Robert Gentleman and Duncan Temple Lang later formalized reproducible analysis, and dynamic-document tools and the FAIR principles subsequently made these practices mainstream.

Debates

Reproducibility versus replicability
Regenerating the same results from the same data and code (reproducibility) is distinct from obtaining consistent findings in a new study (replicability); there is ongoing discussion about terminology and about how much each guarantees scientific validity.
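The distinction can be made concrete with a small simulation, offered as a sketch under stated assumptions: the "studies" are simulated draws from a normal distribution with mean 10 and standard deviation 2, and the seeds, sample size and function names are all hypothetical.

```python
import random
import statistics


def draw_sample(seed, n=200):
    """Simulate one study's data; the seed stands in for a data collection."""
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def analyze(data):
    """The analysis under scrutiny: here, just the sample mean."""
    return round(statistics.mean(data), 6)


original_data = draw_sample(seed=1)

# Reproducibility: same data and same code give the same number.
first_run = analyze(original_data)
second_run = analyze(original_data)

# Replicability: new data and the same method give a consistent,
# but generally not identical, estimate.
new_study = analyze(draw_sample(seed=2))

print("reproduced exactly:", first_run == second_run)
print("replication differs by:", abs(first_run - new_study))
```

Rerunning the analysis on the original data tests only the computation, whereas the second sample tests whether the finding itself holds up, which is why replicability is the harder standard.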

Key figures

  • Donald Knuth
  • Robert Gentleman
  • Duncan Temple Lang
  • Jon Claerbout

Seminal works

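  • Knuth, D. E. (1984). Literate programming. The Computer Journal. [knuth1984]
  • Gentleman, R. and Temple Lang, D. (2007). Statistical analyses and reproducible research. Journal of Computational and Graphical Statistics. [gentleman2007]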

Frequently asked questions

Is reproducibility the same as getting the same scientific conclusion in a new experiment?
No. Reproducibility means regenerating the same results from the same data and code. Obtaining a consistent finding in a fresh study with new data is replicability, a separate and generally harder standard.
What tools support reproducible research?
Three kinds of tools work together: dynamic-document systems and notebooks, which run code to produce figures and tables; version control, which tracks changes to code and text; and environment-capture tools, which record the software versions used. In combination they allow others to rerun an analysis and obtain the same results.
