Selection Bias

Selection bias is a systematic error introduced by the way subjects are chosen for a study or retained in the analysis, when that selection depends on both the exposure and the outcome (or on causes of each). Because the analysed sample then misrepresents the exposure-outcome relationship in the source population, the estimated association can be distorted even if exposure and outcome are measured perfectly.

Definition

Selection bias is a distortion of the exposure-outcome association that arises when the probability of being included in (or remaining in) the study population is influenced jointly by the exposure and the outcome, so that the association in the analysed sample differs from that in the source population.

Scope

The entry covers the structural definition of selection bias as conditioning on a common effect (collider), its common forms in epidemiologic studies — such as control-selection bias, loss to follow-up, and self-selection — and how it differs from confounding. It is a methodological reference and gives no clinical advice.

Core questions

Does entry into or retention in the study depend on both exposure and outcome?
Is the analysis conditioning on a common effect (collider) of exposure and outcome?
How do losses to follow-up or non-response relate to exposure and outcome?
Can the direction and magnitude of the resulting bias be reasoned about?
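
The last question can be made concrete for the odds ratio. If each cell of the source-population 2x2 table is sampled with its own selection probability, the odds ratio in the selected sample equals the source odds ratio multiplied by a cross-product of those probabilities. The sketch below computes that factor; the function name, parameter names, and example probabilities are hypothetical and chosen only for illustration.

```python
def selection_bias_factor(s_exp_case, s_unexp_case, s_exp_noncase, s_unexp_noncase):
    """Ratio of the selected-sample odds ratio to the source-population odds ratio.

    Each argument is an assumed probability of selection for one cell of the
    exposure-by-outcome table. A value above 1 means the odds ratio is biased
    upward, below 1 downward, and exactly 1 means no bias in the odds ratio.
    """
    return (s_exp_case * s_unexp_noncase) / (s_unexp_case * s_exp_noncase)


if __name__ == "__main__":
    # Selection depends on outcome only: the factor is 1, so no bias.
    print(selection_bias_factor(0.8, 0.8, 0.2, 0.2))
    # Exposed cases are over-sampled relative to unexposed cases: upward bias.
    print(selection_bias_factor(0.9, 0.6, 0.3, 0.3))
```

Under this simplified model the factor is 1 whenever selection depends on exposure alone or on outcome alone, which is consistent with the first core question: the odds ratio is distorted only when selection depends on the two jointly.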

Key concepts

Collider and collider-stratification bias
Conditioning on a common effect
Control-selection bias
Loss to follow-up / attrition
Self-selection and non-response
Berksonian (hospital-admission) bias
Healthy-worker effect

Mechanisms

In structural terms, selection bias occurs when the analysis conditions on a variable that is a common effect (a collider) of the exposure and the outcome, or of causes of each. Conditioning on a collider opens a non-causal association between exposure and outcome within the selected sample, even when none exists in the source population. This single mechanism unifies many named biases: choosing controls whose selection depends on exposure (control-selection bias), differential loss to follow-up related to both exposure and prognosis, hospital-based comparisons where admission depends on multiple conditions (Berksonian bias), and the healthy-worker effect. Selection bias is distinct from confounding: confounding is a pre-existing common cause, whereas selection bias is created by the selection process itself and is not necessarily removed by adjusting for measured covariates.
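
A small simulation can illustrate the mechanism. In the sketch below, exposure and outcome are generated independently, so the odds ratio in the full population should be close to 1, while selection into the sample depends on both. All prevalences and selection probabilities are assumed values picked for illustration, not estimates from any study.

```python
import random


def odds_ratio(pairs):
    """Odds ratio from a list of (exposed, outcome) boolean pairs."""
    a = b = c = d = 0
    for exposed, outcome in pairs:
        if exposed and outcome:
            a += 1
        elif exposed:
            b += 1
        elif outcome:
            c += 1
        else:
            d += 1
    return (a * d) / (b * c)


def simulate(n=200_000, seed=0):
    """Return (population odds ratio, selected-sample odds ratio)."""
    rng = random.Random(seed)
    population, selected = [], []
    for _ in range(n):
        exposed = rng.random() < 0.3   # assumed exposure prevalence
        outcome = rng.random() < 0.2   # assumed risk, independent of exposure
        # Selection is a common effect (collider) of exposure and outcome.
        p_select = 0.05 + 0.40 * exposed + 0.40 * outcome
        population.append((exposed, outcome))
        if rng.random() < p_select:
            selected.append((exposed, outcome))
    return odds_ratio(population), odds_ratio(selected)


if __name__ == "__main__":
    or_population, or_selected = simulate()
    print(f"source population OR: {or_population:.2f}")
    print(f"selected sample OR:   {or_selected:.2f}")
```

With these assumed probabilities the selected-sample odds ratio is expected to fall well below 1 (around 0.2), an inverse association that exists only because of how the sample was selected. This is the pattern usually described for Berksonian bias.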

Clinical relevance

Selection bias is a major reason an observational estimate may be wrong even for the population it was meant to describe, so appraising how subjects were selected and retained is central to weighing the evidence. The concept explains how study results can mislead; it is not guidance for any individual's care.

Epidemiology

Selection bias is a concern across study designs but takes characteristic forms in each: control selection in case-control studies, attrition in cohort studies, and non-response in surveys. The structural (causal-diagram) account from the 2000s gave a unified way to anticipate it at the design stage and to reason about its likely direction.

History

Individual selection biases — Berkson's hospital-admission paradox, the healthy-worker effect, control-selection problems — were described through the twentieth century as separate phenomena. Hernán, Hernández-Díaz, and Robins (2004) recast them under a single structural definition based on conditioning on a common effect, and Cole and colleagues (2009) gave an accessible numerical illustration of collider bias, clarifying how selection creates non-causal associations.

Debates

When does adjusting for a variable cause rather than cure bias?: Because conditioning on a collider induces selection bias, adjusting for or selecting on a common effect of exposure and outcome (or their causes) can create a spurious association where none existed, which is why the causal structure must guide what is conditioned on.
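
The point can be checked with exact arithmetic rather than simulation. The sketch below assumes an exposure E and outcome Y that are independent in the source population, plus a collider C whose probability depends on both, and computes the E-Y odds ratio within each stratum of C, which is what stratifying or adjusting on C would estimate. The function name and all probabilities are hypothetical.

```python
def stratum_odds_ratios(p_exposed, p_outcome, p_collider):
    """E-Y odds ratio within C=1 and C=0 when E and Y are independent.

    p_collider maps (e, y) to an assumed P(C=1 | E=e, Y=y).
    Returns a dict {1: OR within C=1, 0: OR within C=0}.
    """
    result = {}
    for c in (1, 0):
        cell = {}
        for e in (0, 1):
            for y in (0, 1):
                p_e = p_exposed if e else 1 - p_exposed
                p_y = p_outcome if y else 1 - p_outcome
                p_c = p_collider[(e, y)] if c else 1 - p_collider[(e, y)]
                # Joint probability of E=e, Y=y, C=c under independence of E and Y.
                cell[(e, y)] = p_e * p_y * p_c
        result[c] = (cell[(1, 1)] * cell[(0, 0)]) / (cell[(1, 0)] * cell[(0, 1)])
    return result


if __name__ == "__main__":
    p_collider = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.5, (1, 1): 0.9}
    print(stratum_odds_ratios(0.3, 0.2, p_collider))
```

In this assumed setup the marginal odds ratio is 1 by construction, yet both strata give an odds ratio of about 0.36, so adjusting for C manufactures an association instead of removing one.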

Key figures

Miguel Hernán
James Robins
Sander Greenland
Joseph Berkson

Seminal works

hernan-2004
cole-2009
delgado-rodriguez-2004

Frequently asked questions

How is selection bias different from confounding?: Confounding comes from a pre-existing common cause of exposure and outcome, whereas selection bias is produced by the selection process — typically by conditioning on a common effect (collider) of the two — and cannot generally be fixed by adjusting for measured confounders.
What is collider bias?: It is selection bias created when the analysis conditions on a variable that is a common effect of the exposure and the outcome; doing so opens a non-causal association between them within the selected sample.
Does selection bias affect internal or external validity?: It is primarily a threat to internal validity, because it distorts the association within the studied sample relative to the source population, though it also bears on generalisability.