Health Data Governance and Data Quality
Health data governance is the set of policies, roles, and accountabilities that determine who may access health data, for what purposes, and under what controls; data quality is the degree to which those data are fit for their intended use. Together they decide whether analytics built on routinely collected health data can be trusted.
Definition
Health data governance is the framework of accountability, policy, and control over the management and use of health data, while data quality assessment is the systematic evaluation of whether those data are complete, correct, and plausible enough for a given analytic purpose.
Scope
This topic covers the stewardship structures that govern health data and the dimensions and methods used to assess their quality, including completeness, correctness, and plausibility. It addresses why secondary use of clinical data requires explicit governance and quality assessment. It is a reference treatment of methods and principles, not legal, regulatory, or compliance advice for any jurisdiction.
Key concepts
- Data stewardship and accountability
- Data quality dimensions (completeness, correctness, plausibility)
- Fitness for use
- Secondary use of clinical data
- Harmonized data quality terminology
- FAIR principles (findable, accessible, interoperable, reusable)
- Provenance and data lineage
- Access control and data-use agreements
Mechanisms
Because clinical and administrative data are collected for care and billing rather than for research, their reuse requires both governance and quality control. Governance assigns stewardship: defined roles decide who may access the data, which uses are permissible, and which safeguards apply, and they record those decisions in data-use agreements. Quality assessment then evaluates the data against the dimensions relevant to the task. Reviews of electronic health record data quality have grouped assessment criteria into recurring dimensions such as completeness, correctness, concordance (agreement between data elements or sources), plausibility, and currency (timeliness), and later harmonization work proposed a shared terminology so that institutions describe quality consistently. Stewardship principles such as FAIR hold that data should be findable, accessible, interoperable, and reusable; they complement quality assessment by addressing how data are organized and shared.
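As a rough illustration of what evaluating data against a dimension can look like, the sketch below computes a completeness score and a plausibility score for one field. It is a minimal example under stated assumptions, not a standard method: the `Observation` record, the `systolic_bp` field, and the plausibility bounds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """Hypothetical record type; real schemas will differ."""
    patient_id: str
    systolic_bp: Optional[float]  # mmHg; None means the value was not recorded


# Illustrative bounds only (an assumption, not a clinical reference range).
PLAUSIBLE_SYSTOLIC = (50.0, 300.0)


def completeness(rows, field):
    """Fraction of rows in which `field` is recorded (not None)."""
    if not rows:
        return 0.0
    present = sum(1 for r in rows if getattr(r, field) is not None)
    return present / len(rows)


def plausibility(rows, field, bounds):
    """Fraction of recorded values that fall inside `bounds` (inclusive)."""
    low, high = bounds
    values = [getattr(r, field) for r in rows if getattr(r, field) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if low <= v <= high) / len(values)


rows = [
    Observation("p1", 120.0),
    Observation("p2", None),     # missing value lowers completeness
    Observation("p3", 1200.0),   # outside the bounds, lowers plausibility
    Observation("p4", 135.0),
]

print(completeness(rows, "systolic_bp"))                      # 3 of 4 recorded
print(plausibility(rows, "systolic_bp", PLAUSIBLE_SYSTOLIC))  # 2 of 3 in range
```

Note that a range check of this kind measures plausibility, not correctness: a value can sit inside the bounds and still be wrong for that patient, which is why correctness usually needs comparison against another source.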
Clinical relevance
Governance and quality determine whether evidence derived from routinely collected data is dependable. Poor data quality can bias risk-prediction models and quality measures that influence care decisions, as systematic reviews of prediction modeling have noted. Understanding these methods helps users gauge the trustworthiness of data-derived findings. This topic describes principles of stewardship and assessment and does not constitute regulatory, privacy, or compliance guidance.
History
As secondary use of clinical data grew, the field recognized that uncontrolled, unassessed data could mislead. During the 2010s, systematic reviews catalogued the dimensions of electronic health record data quality, harmonization efforts proposed shared terminology and frameworks for assessing fitness for use, and the FAIR principles articulated broader stewardship expectations for research data. These developments established governance and quality as prerequisites for credible health analytics.
Debates
- Is data quality an intrinsic property or relative to use?
- A recurring tension is whether data quality can be judged absolutely or only against a specific analytic purpose. The prevailing view frames quality as fitness for use: data adequate for one analysis may be inadequate for another, which makes universal quality standards difficult to define.
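The fitness-for-use view can be made concrete with a small sketch in which the same measured scores pass for one purpose and fail for another. Everything here is an assumption for illustration: the purpose names, the thresholds in `REQUIREMENTS`, and the measured values are hypothetical, and real requirements would be set by the analysts and stewards for each study.

```python
# Hypothetical minimum scores per analytic purpose (illustrative values only).
REQUIREMENTS = {
    "cohort_count": {"completeness": 0.50},
    "risk_prediction": {"completeness": 0.95, "plausibility": 0.99},
}


def fit_for_use(measured, purpose, requirements=REQUIREMENTS):
    """Return True if every dimension required for `purpose` meets its minimum.

    A dimension that was never measured is treated as scoring 0.0,
    so unassessed data cannot pass a requirement by default.
    """
    needed = requirements[purpose]
    return all(measured.get(dim, 0.0) >= minimum for dim, minimum in needed.items())


# One dataset, one set of measured scores (made-up numbers).
measured = {"completeness": 0.80, "plausibility": 0.97}

print(fit_for_use(measured, "cohort_count"))     # True
print(fit_for_use(measured, "risk_prediction"))  # False
```

The scores do not change between the two calls; only the purpose does, which is the sense in which quality is relative to use.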
Key figures
- Nicole Weiskopf
- Chunhua Weng
- Michael Kahn
- Mark Wilkinson
Related topics
Seminal works
- weiskopf-weng-2013
- kahn-2016
- wilkinson-2016
Frequently asked questions
- What are the common dimensions of health data quality?
- Reviews of electronic health record data quality typically describe dimensions such as completeness, correctness or accuracy, concordance, plausibility, and currency. The relevant dimensions depend on the intended analytic use of the data.
- How does data governance differ from data quality?
- Governance is about authority and accountability: who controls the data and how its use is permitted and safeguarded. Data quality is about the data's fitness for use. Good governance creates the conditions under which quality can be maintained and assessed.
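The governance side of that distinction can be sketched as a permission check that runs before any quality assessment: it asks whether a use is allowed, not whether the data are good enough. This is a simplified illustration under assumed names; the `DataUseAgreement` fields and the `may_access` function are hypothetical and do not represent any particular policy, system, or legal requirement.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataUseAgreement:
    """Hypothetical, simplified record of an approved data use."""
    requester: str
    permitted_purposes: frozenset
    expires: date


def may_access(agreement, requester, purpose, today):
    """Allow access only for the named requester, an approved purpose,
    and a date on or before the agreement's expiry."""
    return (
        agreement.requester == requester
        and purpose in agreement.permitted_purposes
        and today <= agreement.expires
    )


dua = DataUseAgreement(
    requester="analytics-team",                      # made-up identifier
    permitted_purposes=frozenset({"quality_measurement"}),
    expires=date(2030, 12, 31),
)

print(may_access(dua, "analytics-team", "quality_measurement", date(2030, 1, 1)))  # True
print(may_access(dua, "analytics-team", "marketing", date(2030, 1, 1)))            # False
```

A request that passes this check still says nothing about completeness or plausibility; those are assessed separately against the intended analysis.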