Cohort Studies

Following groups forward over time

A cohort study is an observational design that follows groups defined by their exposure status over time to compare the incidence of an outcome. Prospective cohorts recruit participants and follow them forward; retrospective cohorts reconstruct follow-up from existing records. Because exposure is recorded before the outcome occurs, the design establishes the temporal ordering that causal inference requires, and a single cohort can be used to examine multiple outcomes. Prospective cohorts in particular are costly and time-consuming, and any cohort is vulnerable to loss to follow-up.

Defining the Concept

A cohort is a group of people defined by a shared characteristic or exposure. A cohort study follows this group over time to observe whether an outcome of interest (such as disease, death, or recovery) occurs. Researchers do not manipulate the exposure; they record events as they unfold under natural conditions. For this reason, the cohort study is classified as an observational design, distinct from a randomized controlled trial, in which investigators assign the exposure. Its primary goal is to estimate the association between exposure and outcome, typically by measuring incidence in each exposure group and comparing the groups with a measure such as relative risk.
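Incidence in a cohort is often expressed per unit of person-time, so that participants followed for different lengths of time each contribute only the time they were actually observed. The sketch below illustrates that calculation; the `Participant` record, the `incidence_rate` helper, and the five follow-up records are hypothetical and chosen only to make the arithmetic easy to check.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Participant:
    # Years from enrollment until the outcome, dropout, or end of study.
    years_followed: float
    # True if the outcome occurred during the observed follow-up.
    had_outcome: bool


def incidence_rate(participants: Iterable[Participant], per: float = 1000.0) -> float:
    """New cases divided by total person-years at risk, scaled to `per` person-years."""
    people = list(participants)
    person_years = sum(p.years_followed for p in people)
    if person_years <= 0:
        raise ValueError("total follow-up time must be positive")
    events = sum(1 for p in people if p.had_outcome)
    return events / person_years * per


# Hypothetical cohort: two events over 30 person-years of observation.
cohort = [
    Participant(10.0, False),  # completed follow-up without the outcome
    Participant(4.0, True),    # developed the outcome at year 4
    Participant(2.5, False),   # lost to follow-up at year 2.5
    Participant(10.0, False),
    Participant(3.5, True),
]

rate = incidence_rate(cohort)
print(f"{rate:.1f} cases per 1000 person-years")
```

In this sketch the participant lost at year 2.5 still contributes 2.5 person-years to the denominator. That treatment is only unbiased if dropout is unrelated to the outcome, which is an assumption rather than something the calculation can verify.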

How It Works: Main Types and Steps

There are two main types. In prospective cohorts, participants are enrolled at baseline, their exposures are documented, and outcomes are monitored in real time going forward. In retrospective cohorts, researchers reconstruct the cohort and its exposure history from pre-existing records such as hospital files or employment registers. Both types share the same core steps: define the cohort and comparison group, gather exposure information, follow participants for an adequate period, record outcome events, then compare incidence by exposure status.
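The final step, comparing incidence by exposure status, can be sketched in a few lines. The example below computes cumulative incidence in each group, their ratio (the relative risk), and an approximate 95% confidence interval using the standard log-scale formula. The function names and the counts are hypothetical illustrations, not data from any real study.

```python
import math
from typing import Tuple


def cumulative_incidence(cases: int, total: int) -> float:
    """Proportion of an initially at-risk group that develops the outcome."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return cases / total


def relative_risk(
    exposed_cases: int,
    exposed_total: int,
    unexposed_cases: int,
    unexposed_total: int,
) -> Tuple[float, float, float]:
    """Return (RR, lower, upper), with an approximate 95% CI on the log scale.

    Assumes at least one case in each group; otherwise the ratio or its
    standard error is undefined.
    """
    if exposed_cases <= 0 or unexposed_cases <= 0:
        raise ValueError("this sketch needs at least one case in each group")
    risk_exposed = cumulative_incidence(exposed_cases, exposed_total)
    risk_unexposed = cumulative_incidence(unexposed_cases, unexposed_total)
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for cumulative-incidence data.
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 of 1000 exposed and 10 of 1000 unexposed develop the outcome.
rr, lower, upper = relative_risk(30, 1000, 10, 1000)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these made-up counts the relative risk is 3, with an interval of roughly 1.5 to 6.1. The interval formula is a large-sample approximation and does nothing to address confounding or loss to follow-up.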

A Concrete Example

The Framingham Heart Study is one of the most widely cited examples of this design. Beginning in 1948, roughly 5,000 participants in Framingham, Massachusetts, were enrolled, examined every two years, and followed for decades to track cardiovascular outcomes. This allowed researchers to establish the association of risk factors such as smoking, hypertension, and elevated cholesterol with heart disease. Although researchers made no interventions, the long-term follow-up provided the temporal evidence necessary for causal reasoning. The study illustrates how cohort designs can shape public health policy and clinical practice.

Common Pitfalls and Good Practice

The most serious threat is loss to follow-up: if participants drop out for reasons related to the exposure or outcome, estimates become biased. Confounding is a second threat: when exposure groups differ at baseline in factors that also affect the outcome, the comparison between them is distorted. Measurement error is a particular concern in retrospective designs, which depend on the quality of records that were usually collected for other purposes. A well-designed cohort study addresses these issues through careful participant selection, standardized exposure measurement, high follow-up rates, and control of confounding in the design (restriction, matching) or in the analysis (stratification, regression adjustment). An a priori power analysis and sample size calculation also belong in the planning stage.
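One common way to control for a measured confounder in the analysis is stratification: compute the association within levels of the confounder and pool the stratum-specific results. The sketch below compares a crude relative risk with a Mantel-Haenszel pooled relative risk. The two age strata and all counts are hypothetical, constructed so that the exposure has no effect within either stratum.

```python
from typing import List, Tuple

# Each stratum: (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
Stratum = Tuple[int, int, int, int]


def crude_rr(strata: List[Stratum]) -> float:
    """Relative risk after collapsing all strata into one table."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_rr(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled relative risk for cumulative-incidence data."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0  # total people in this stratum
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


# Hypothetical data: older participants are more often exposed AND at higher
# baseline risk, so age confounds the exposure-outcome comparison.
strata = [
    (4, 200, 16, 800),   # younger: risk is 2% in both exposure groups
    (80, 800, 20, 200),  # older: risk is 10% in both exposure groups
]

crude = crude_rr(strata)
adjusted = mantel_haenszel_rr(strata)
print(f"crude RR = {crude:.2f}, age-adjusted RR = {adjusted:.2f}")
# prints "crude RR = 2.33, age-adjusted RR = 1.00"
```

In this constructed example the crude estimate suggests that exposure more than doubles the risk, while the stratified estimate shows no association once age is held fixed. Stratification can only adjust for confounders that were measured; unmeasured ones remain a limitation of any observational design.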

Key terms

Exposure: The factor or condition defining the group and examined for its association with the outcome.
Incidence: The rate at which new cases of an outcome occur among at-risk individuals over time.
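Loss to Follow-Up: Incomplete outcome data that arise when participants leave the study or can no longer be traced before follow-up ends; it introduces bias when dropout is related to the exposure or the outcome.
Relative Risk: The ratio of incidence in the exposed group to incidence in the unexposed group; a value above 1 indicates higher incidence among the exposed.
Confounder: A third variable, associated with both exposure and outcome and not a step on the causal pathway between them, that can distort the true association.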