The Potential Outcomes Framework
Causal effect as a counterfactual
The potential outcomes framework defines a causal effect as the difference between what would happen to a unit under treatment versus under control. Only one of these two potential outcomes is ever observed — a constraint known as the fundamental problem of causal inference. Estimands such as the average treatment effect and the average treatment effect on the treated make causal questions precise, and identification rests on assumptions such as ignorability.
Defining the Concept
The potential outcomes framework, also called the Rubin causal model, grounds causality in counterfactual logic. For each unit i, two potential outcomes are defined: Y_i(1) is the outcome under treatment and Y_i(0) is the outcome under control. The unit-level causal effect is the difference between these two values. However, a unit cannot be simultaneously treated and untreated, so only one potential outcome is ever observed per unit. This constraint is the fundamental problem of causal inference and makes statistical inference unavoidable.
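The missing-data character of the problem can be made concrete with a small simulation. The following is a minimal sketch, not a standard implementation: the function names, the constant effect of 5, and the outcome distribution are all assumptions chosen for illustration. It generates both potential outcomes for each unit, then hides the one that would not be observed.

```python
import random


def simulate_units(n, seed=0):
    """Build hypothetical units that carry both potential outcomes."""
    rng = random.Random(seed)
    units = []
    for i in range(n):
        y0 = rng.gauss(50.0, 10.0)    # Y_i(0): outcome under control (assumed distribution)
        y1 = y0 + 5.0                 # Y_i(1): outcome under treatment (assumed constant effect of 5)
        d = rng.randint(0, 1)         # treatment indicator: 1 = treated, 0 = control
        y_obs = y1 if d == 1 else y0  # only the outcome matching the received condition is seen
        units.append({"id": i, "d": d, "y0": y0, "y1": y1, "y_obs": y_obs})
    return units


def observed_view(units):
    """Return what an analyst actually sees: the counterfactual becomes None."""
    return [
        {
            "id": u["id"],
            "d": u["d"],
            "y0": u["y0"] if u["d"] == 0 else None,
            "y1": u["y1"] if u["d"] == 1 else None,
            "y_obs": u["y_obs"],
        }
        for u in units
    ]


for row in observed_view(simulate_units(5)):
    print(row)
```

Every printed row has exactly one of `y0` and `y1` set to `None`, so the unit-level difference cannot be computed from the observed data alone.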
How It Works: Estimands and Identification
Because unit-level effects are unobservable, researchers focus on population-level estimands. The average treatment effect (ATE) is the expected value of the difference in potential outcomes across all units. The average treatment effect on the treated (ATT) restricts this average to the units that actually received treatment. Identifying these estimands from observational data requires assumptions such as strong ignorability, which has two parts: treatment assignment is independent of the potential outcomes conditional on observed covariates, and every unit has a positive probability of receiving either condition (overlap). Random assignment satisfies both parts by design.
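These estimands and assumptions can be written compactly. The notation below is a conventional choice rather than something fixed by the text above: D_i is assumed to denote the treatment indicator and X_i the observed covariates. The last line shows how, under the two ignorability conditions, the ATE can be expressed entirely in terms of observed outcomes Y_i.

```latex
\begin{align*}
\tau_i &= Y_i(1) - Y_i(0)
  && \text{(unit-level effect, never observed)} \\
\mathrm{ATE} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) \right] \\
\mathrm{ATT} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) \mid D_i = 1 \right] \\
\bigl( Y_i(0),\, Y_i(1) \bigr) &\perp\!\!\!\perp D_i \mid X_i
  && \text{(conditional independence)} \\
0 < \Pr(D_i = 1 \mid X_i) &< 1
  && \text{(overlap)} \\
\mathrm{ATE} &= \mathbb{E}_{X}\Bigl[ \mathbb{E}[Y_i \mid D_i = 1, X_i] - \mathbb{E}[Y_i \mid D_i = 0, X_i] \Bigr]
  && \text{(identification)}
\end{align*}
```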
A Concrete Example
Consider estimating the effect of a scholarship program on student graduation. For each student, Y_i(1) represents whether the student would graduate if awarded the scholarship, and Y_i(0) represents whether the student would graduate without it. For scholarship recipients, Y_i(0) is unobserved; for non-recipients, Y_i(1) is unobserved. In a randomized study, the treated and control groups are equivalent on average in their background characteristics, so the observed difference in mean graduation rates is an unbiased estimate of the ATE. In an observational study, recipients may differ systematically from non-recipients, so additional methods such as matching or regression adjustment are needed.
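The contrast between randomized and observational data in this example can be sketched with simulated students. Everything numeric here is an assumption made for illustration: the binary covariate `high_prep`, the graduation probabilities, the true effect of 0.1, and the assignment probabilities. Stratifying on the covariate stands in for the adjustment methods mentioned above. It is one simple option, not the only one.

```python
import random
from statistics import mean


def make_students(n, randomized, seed=1):
    """Simulate hypothetical students; the true ATE is 0.1 by construction."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        high_prep = rng.random() < 0.5       # hypothetical background covariate
        p_grad0 = 0.7 if high_prep else 0.4  # assumed P(graduate) without scholarship
        p_grad1 = p_grad0 + 0.1              # assumed P(graduate) with scholarship
        u = rng.random()                     # one shared draw, so Y(1) >= Y(0) for every student
        y0 = int(u < p_grad0)
        y1 = int(u < p_grad1)
        if randomized:
            p_treat = 0.5                    # coin-flip assignment
        else:
            p_treat = 0.8 if high_prep else 0.2  # better-prepared students get awards more often
        d = int(rng.random() < p_treat)
        students.append({"x": high_prep, "d": d, "y": y1 if d else y0})
    return students


def diff_in_means(rows):
    """Observed mean outcome of the treated minus that of the controls."""
    treated = [r["y"] for r in rows if r["d"] == 1]
    control = [r["y"] for r in rows if r["d"] == 0]
    return mean(treated) - mean(control)


def stratified_estimate(rows):
    """Average the within-stratum differences, weighted by stratum share."""
    estimate = 0.0
    for x in (True, False):
        stratum = [r for r in rows if r["x"] == x]
        estimate += len(stratum) / len(rows) * diff_in_means(stratum)
    return estimate


rct = make_students(20000, randomized=True)
obs = make_students(20000, randomized=False)
print("randomized, difference in means:   ", round(diff_in_means(rct), 3))
print("observational, difference in means:", round(diff_in_means(obs), 3))
print("observational, stratified estimate:", round(stratified_estimate(obs), 3))
```

Under these assumptions the randomized difference in means and the stratified observational estimate both land near the true effect of 0.1. The unadjusted observational difference is much larger, because it mixes the scholarship effect with the higher baseline graduation rate of better-prepared students.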
Common Pitfalls and Good Practice
The most common mistake is conflating correlation with causation: an observed association has a causal interpretation only when identifying assumptions such as ignorability hold. The stable unit treatment value assumption (SUTVA) is also frequently overlooked. It requires that one unit's potential outcomes are not affected by another unit's treatment status and that each treatment has a single, well-defined version. Researchers should state the estimand explicitly before data collection and should not conflate the ATE with the ATT, because the two differ whenever effects vary across units and the treated units are not representative of the population. Good practice requires justifying the identifying assumptions and reporting sensitivity analyses that examine how results change when those assumptions are relaxed.
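The gap between the ATE and the ATT is easy to see in a simulation where both potential outcomes are known. The sketch below uses assumed values throughout: half the units gain 4 from treatment, half gain nothing, and units that would benefit take the treatment with probability 0.8 instead of 0.2. These quantities are computable here only because the simulation reveals both potential outcomes, which real data never do.

```python
import random
from statistics import mean


def simulate(n=10000, seed=2):
    """Return (d, y0, y1) tuples with heterogeneous, selection-related effects."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gain = rng.choice([0.0, 4.0])  # hypothetical unit-level effect
        y0 = rng.gauss(10.0, 1.0)      # outcome under control (assumed distribution)
        y1 = y0 + gain                 # outcome under treatment
        p_treat = 0.8 if gain > 0 else 0.2  # units that benefit select into treatment more often
        d = int(rng.random() < p_treat)
        rows.append((d, y0, y1))
    return rows


def true_ate(rows):
    """Mean effect over all units."""
    return mean(y1 - y0 for _, y0, y1 in rows)


def true_att(rows):
    """Mean effect over treated units only."""
    return mean(y1 - y0 for d, y0, y1 in rows if d == 1)


rows = simulate()
print("ATE:", round(true_ate(rows), 2))
print("ATT:", round(true_att(rows), 2))
```

With these assumed values the population quantities are an ATE of 2.0 and an ATT of 3.2, so reporting one under the other's name would misstate the effect by more than half.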
Key terms
- Potential Outcome
- The hypothetical outcome a unit would exhibit under a specific treatment condition.
- Fundamental Problem of Causal Inference
- Only one potential outcome per unit is observable; the other is necessarily missing.
- Average Treatment Effect (ATE)
- The population mean of the difference in potential outcomes across all units.
- Ignorability
- Assumption that treatment assignment is independent of potential outcomes given covariates.
- SUTVA
- Assumption that one unit's potential outcomes are unaffected by other units' treatment status and that each treatment has a single, well-defined version.
Further reading
- Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688-701. DOI: 10.1037/h0037350