Validity and Reliability
The two core dimensions of measurement quality
Reliability refers to the consistency or repeatability of a measurement across time points, raters, or items. Validity concerns whether an instrument actually measures what it claims to measure. A measure can be reliable yet invalid — consistently off-target — while validity presupposes at least some degree of reliability. Together these two properties form the foundational quality criteria of scientific measurement.
Core Concepts: Definition and Distinction
Reliability is the degree to which a measurement instrument produces consistent results under the same conditions. Its main forms include test-retest reliability, inter-rater reliability, and internal consistency (e.g., Cronbach's alpha). Validity concerns whether the instrument correctly captures the intended theoretical construct. The relationship between them is asymmetric: high reliability does not guarantee validity, but validity presupposes at least a minimum level of reliability. Measurement error has two components — systematic error (bias) and random error. Reliability addresses only random error; only validity evaluation reveals systematic error.
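As an illustration of internal consistency, the following is a minimal sketch of how Cronbach's alpha could be computed from a respondents-by-items score matrix. The function name `cronbach_alpha` and the response data are hypothetical and exist only for this example. The formula used is the standard variance-based form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Illustrative helper (not from the source text); uses sample variances.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 respondents answering 3 items on a 1-5 scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

A high value here says only that the three items vary together. It says nothing about whether they capture the intended construct.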
Types of Validity and the Cronbach-Meehl Framework
Cronbach and Meehl (1955) placed construct validity at the center of psychological measurement. Content validity asks whether the instrument covers all relevant dimensions of the domain. Criterion validity examines the extent to which scores correspond to an external criterion — either concurrently or predictively. Construct validity is the broadest concept: it tests how well a theoretical construct aligns with empirical indicators and subsumes convergent and discriminant evidence. Modern measurement theory treats these types not as mutually exclusive categories but as complementary facets of a unified validity argument.
Common Misconceptions
The most frequent error is assuming that a high Cronbach's alpha establishes validity. Alpha measures only internal consistency and is blind to systematic bias. Highly intercorrelated items that all tap the same narrow topic can yield a high alpha without providing evidence about the scale's dimensionality or its construct validity. A related misconception is that reliability implies validity: a compass that consistently points in the wrong direction is reliable but not valid. Finally, validity is not a one-time verdict. It is an ongoing evidence-building process that must be reassessed whenever the context, sample, or intended use changes.
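The compass example can be made concrete with a small sketch that separates systematic error (bias) from random error (spread). All readings, the assumed true heading of 90 degrees, and the helper `summarize` are hypothetical values chosen for illustration.

```python
from statistics import mean, stdev

TRUE_HEADING = 90.0  # hypothetical true bearing, in degrees

# Hypothetical repeated readings from two compasses.
miscalibrated = [119.8, 120.1, 120.0, 119.9, 120.2]  # consistent but off-target
noisy_unbiased = [78.0, 101.0, 95.0, 84.0, 92.0]     # on-target on average, scattered


def summarize(readings, truth):
    """Return (bias, spread): mean deviation from truth and sample std. deviation."""
    bias = mean(readings) - truth   # systematic error
    spread = stdev(readings)        # random error
    return bias, spread


for name, readings in [("miscalibrated", miscalibrated),
                       ("noisy_unbiased", noisy_unbiased)]:
    bias, spread = summarize(readings, TRUE_HEADING)
    print(f"{name}: bias={bias:+.1f} deg, spread={spread:.2f} deg")
```

The miscalibrated compass has a small spread and a large bias, which makes it reliable but not valid. A consistency statistic computed from its readings alone would not reveal the offset, because detecting it requires comparison against an external reference.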
Importance in Research Practice
Reliability and validity must be considered at every stage of research, from instrument development through data analysis. Random measurement error attenuates observed effect sizes and therefore enters statistical power calculations: an observed effect cannot be corrected for attenuation without knowing the reliability coefficient (r_xx) of the instrument. Advanced techniques such as item response theory and structural equation modeling explicitly incorporate both properties within the analytical model. Researchers are expected to present evidence for both reliability and validity and to report the limitations of their instruments transparently, so that claims about the generalizability of findings remain defensible.
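The role of the reliability coefficient in correcting effect sizes can be sketched with Spearman's classical correction for attenuation, r_corrected = r_xy / sqrt(r_xx * r_yy). Choosing this particular correction is an assumption made for illustration, since the text does not name a specific formula. The function name `disattenuate`, the symbol r_yy (reliability of the second measure), and the numbers are likewise hypothetical.

```python
from math import sqrt


def disattenuate(r_xy, r_xx, r_yy=1.0):
    """Spearman's correction for attenuation (illustrative helper).

    r_xy: observed correlation between measures X and Y
    r_xx: reliability of X
    r_yy: reliability of Y (defaults to 1.0, i.e. Y assumed error-free)
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


# Hypothetical values: observed r = 0.30, instrument reliability r_xx = 0.64.
observed = 0.30
corrected = disattenuate(observed, r_xx=0.64)
print(f"observed r = {observed:.2f}, corrected r = {corrected:.3f}")
```

With these assumed numbers, the observed correlation understates the error-free relationship. Without an estimate of r_xx the correction cannot be computed at all, which is why reliability has to be reported alongside effect sizes.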
Key thinkers
- Lee J. Cronbach (1916–2001): Stanford University psychologist who developed the alpha coefficient and, together with Meehl, introduced the concept of construct validity to psychological measurement.
- Paul E. Meehl (1920–2003): University of Minnesota psychologist renowned for his contributions to clinical and philosophical psychology; his 1955 paper co-authored with Cronbach is considered a landmark in measurement theory.
Sources
- Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. DOI: 10.1037/h0040957