The Research Data Lifecycle

From planning data to reusing it

The Research Data Lifecycle frames data management as stages spanning a project and beyond: planning, collecting or creating, processing, analysing, preserving, sharing or publishing, and reusing data. Mapping these stages helps researchers write data management plans, choose appropriate formats and metadata, meet funder and FAIR requirements, and ensure that data remain usable and citable long after the study ends.

What the Framework Is and Why It Matters

The Research Data Lifecycle maps the stages data pass through, from initial planning to long-term reuse, within a single cyclical model: data reused at the end of one project feed into the planning of the next. The framework helps researchers view data not only as an input to analysis but as a research output in its own right, one that must be preserved and shared. Because funders and journals increasingly require data management plans and open data, understanding the lifecycle has become a core part of academic accountability. The framework also aligns with the FAIR principles: Findable, Accessible, Interoperable, and Reusable.

The Phases in Order

The lifecycle consists of seven phases, usually presented in sequence. In the planning phase, researchers decide what data to collect, how the data will be stored, and with whom they will be shared. In the collecting or creating phase, data are gathered through surveys, experiments, or archival sources. In the processing phase, raw data are cleaned, coded, and prepared for analysis. In the analysing phase, statistical or qualitative methods are applied to produce findings. In the preserving phase, data are archived in formats suited to long-term storage. In the sharing or publishing phase, data are deposited in an institutional or disciplinary repository. Finally, in the reusing phase, other researchers access the data to replicate results or address new questions.
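The ordering of the phases, and the way the model loops back on itself, can be sketched in a few lines of Python. This is only an illustration: the names Phase and next_phase are hypothetical, and the wrap from reusing back to planning is one reading of the "cyclical" model, not part of any formal standard.

```python
from enum import Enum


class Phase(Enum):
    """The seven lifecycle phases, in the order listed above."""
    PLANNING = 1
    COLLECTING = 2
    PROCESSING = 3
    ANALYSING = 4
    PRESERVING = 5
    SHARING = 6
    REUSING = 7


def next_phase(current: Phase) -> Phase:
    """Return the phase after `current`, wrapping from reusing back to planning."""
    phases = list(Phase)  # Enum members iterate in definition order
    index = phases.index(current)
    return phases[(index + 1) % len(phases)]


if __name__ == "__main__":
    # Walk once around the cycle, starting from planning.
    phase = Phase.PLANNING
    for _ in range(len(Phase)):
        print(phase.name.lower())
        phase = next_phase(phase)
```

Modelling the phases as an ordered enumeration keeps the sequence explicit, while the modulo step reflects the idea that reused data feed a new round of planning.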

How It Is Applied in Practice

Researchers most commonly apply the lifecycle when writing a data management plan (DMP) or preparing a funding application. Each phase raises concrete questions: Which metadata standard will be used? In which file format will data be saved, and is that format open? How will sensitive data be anonymised? How many years will data be retained after the project ends? Institutions often provide DMP templates, while repositories such as Zenodo, Figshare, and PANGAEA offer infrastructure for archiving and assigning DOIs. The lifecycle prompts researchers to work through these decisions in a structured sequence.
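The DMP questions above can be captured as a small record that flags unanswered items and works out the retention date. The sketch below is a minimal illustration, not a DMP template: the class DmpDecisions, its field names, and the short OPEN_FORMATS list are all assumptions made for the example, and "Dublin Core" appears only as one possible metadata standard.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative, non-exhaustive set of extensions treated as open formats here.
# This list is an assumption for the example, not an authoritative registry.
OPEN_FORMATS = {".csv", ".txt", ".json", ".xml"}


@dataclass
class DmpDecisions:
    metadata_standard: str
    file_extension: str
    anonymisation_method: str
    project_end: date
    retention_years: int

    def uses_open_format(self) -> bool:
        """True if the chosen extension is in the illustrative open-format set."""
        return self.file_extension.lower() in OPEN_FORMATS

    def retention_end(self) -> date:
        """Project end date plus the retention period, in whole years."""
        target_year = self.project_end.year + self.retention_years
        try:
            return self.project_end.replace(year=target_year)
        except ValueError:
            # 29 February has no counterpart in a non-leap year; use 28 February.
            return self.project_end.replace(year=target_year, day=28)

    def open_questions(self) -> list[str]:
        """Names of the text decisions that are still blank."""
        fields = ("metadata_standard", "file_extension", "anonymisation_method")
        return [name for name in fields if not getattr(self, name).strip()]


if __name__ == "__main__":
    plan = DmpDecisions(
        metadata_standard="Dublin Core",
        file_extension=".csv",
        anonymisation_method="",  # not yet decided
        project_end=date(2026, 6, 30),
        retention_years=10,
    )
    print("Open format:", plan.uses_open_format())
    print("Retain until:", plan.retention_end().isoformat())
    print("Still to decide:", plan.open_questions())
```

Keeping the decisions in one structure makes gaps visible early, which is the point of answering these questions at the planning phase.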

Common Pitfalls and Misconceptions

The most common pitfall is treating data management as a single archiving step at the end of a project; if planning is not done at the outset, problems that surface in later phases often cannot be fixed after the fact. Another misconception is that saving raw files is sufficient; undocumented variables and missing metadata can render data nearly unusable five years later. Researchers sometimes conflate preservation with publishing, but preservation concerns long-term technical accessibility, whereas publishing concerns discovery and citation by the research community. Finally, the reuse phase is frequently overlooked, even though this is where the value of data for other projects becomes apparent.
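The problem of undocumented variables can be checked mechanically before a dataset is archived. The following sketch assumes the data are in CSV form with a header row and that the codebook is a simple mapping from column name to description; the function name undocumented_variables and the sample columns are hypothetical.

```python
import csv
import io


def undocumented_variables(csv_text: str, codebook: dict[str, str]) -> list[str]:
    """Return column names that have no non-empty description in the codebook."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    return [name for name in header if not codebook.get(name, "").strip()]


if __name__ == "__main__":
    # Hypothetical survey extract: two columns are described, two are not.
    raw = "id,age,q1,q2\n1,34,4,2\n"
    codebook = {
        "id": "Participant identifier",
        "age": "Age in years at interview",
    }
    print(undocumented_variables(raw, codebook))
```

A check like this only confirms that a description exists; whether the description is good enough for someone outside the project still needs human review.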

Key Terms

Data Management Plan: A formal document outlining how data will be handled at every stage from collection through reuse.
FAIR Principles: Four guiding principles stating that research data should be Findable, Accessible, Interoperable, and Reusable.
Metadata: Descriptive information about data that explains its context, content, and structure to enable reuse by others.
Data Repository: A digital platform used for long-term storage of and access to research datasets, typically assigning persistent identifiers such as DOIs.
Data Reuse: Using previously collected data to answer new research questions or to verify existing findings.