Contingency Tables and 2×2 Tables

A contingency table is a rectangular array of counts that cross-classifies a sample by two (or more) categorical variables, showing how many observations fall into each combination of categories. Its simplest and most important form in health research is the 2×2 table, which cross-tabulates a binary exposure against a binary outcome and is the starting point for nearly every measure and test of association.

Definition

A contingency table is a cross-classification of a sample into a grid of cells whose entries are the frequencies of observations sharing a given combination of categories of two or more categorical variables; a 2×2 table is the special case with two binary variables and four cells.
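
As an illustration of this definition, the following minimal sketch builds a 2×2 table from raw observations using only the Python standard library. The data, the category labels, and all variable names are hypothetical and chosen purely for demonstration.

```python
from collections import Counter

# Hypothetical raw data: one (exposure, outcome) pair per observation.
observations = [
    ("exposed", "case"), ("exposed", "case"), ("exposed", "noncase"),
    ("unexposed", "case"), ("unexposed", "noncase"), ("unexposed", "noncase"),
    ("unexposed", "noncase"), ("exposed", "case"),
]

rows = ["exposed", "unexposed"]   # categories of the row variable
cols = ["case", "noncase"]        # categories of the column variable

# Each observation falls in exactly one cell, so counting the pairs
# gives the joint cell frequencies.
cell_counts = Counter(observations)
table = [[cell_counts[(r, c)] for c in cols] for r in rows]

# Marginal totals and grand total.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Conditional distribution of the outcome within each exposure row.
row_conditionals = [
    [count / total for count in row] for row, total in zip(table, row_totals)
]

print(table, row_totals, col_totals, grand_total)
```

Here the row variable is the exposure and the column variable is the outcome, which matches the usual exposure-by-outcome layout; swapping `rows` and `cols` would transpose the table without changing the counts.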

Scope

This entry covers how counts are arranged into a contingency table, the anatomy and notation of the 2×2 (fourfold) table, the marginal and joint distributions it displays, the idea of independence between row and column variables, and the role of the table as the common substrate from which chi-squared tests, exact tests, and effect measures are computed. It treats the table as a methodological object, not as clinical guidance.

Core questions

  • How are two categorical variables cross-classified into cells of counts?
  • What are the marginal totals and the joint cell frequencies, and how do they relate under independence?
  • Why is the 2×2 table the canonical layout for a binary exposure and a binary outcome?
  • What expected counts would the cells contain if the row and column variables were independent?
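
The last question can be made concrete with a short sketch. It assumes the table is stored as a list of rows of counts; the function name `expected_counts` and the example counts are hypothetical, not taken from any study.

```python
def expected_counts(table):
    """Expected cell counts if the row and column variables were independent.

    Each expected count is (row total * column total) / grand total.
    Works for any r x c table given as a list of rows of counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts: rows are exposed / unexposed,
# columns are case / noncase.
observed = [[30, 70], [10, 90]]
expected = expected_counts(observed)

# Observed minus expected, cell by cell: the discrepancies that
# association tests evaluate.
deviations = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
print(expected)
print(deviations)
```

In this example both rows have the same total, so independence would give both rows the same expected counts (20 cases and 80 noncases each); the observed table departs from that by 10 in every cell.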

Key concepts

  • Rows, columns, and cells
  • Marginal totals and grand total
  • Joint and conditional distributions
  • Independence and expected counts under independence
  • The 2×2 (fourfold) table layout a, b, c, d
  • Exposure-by-outcome cross-tabulation

Mechanisms

Each observation is placed in exactly one cell according to its combination of categories, so the table records the joint frequency distribution. Summing the cells in a row or column gives that row's or column's marginal total, and dividing each cell by its row (or column) total gives the conditional distribution within that row (or column). Under the hypothesis that the two variables are independent, the expected count in a cell is its row total multiplied by its column total, divided by the grand total; association tests evaluate the discrepancies between observed and expected counts. In the 2×2 case the four cells are conventionally labelled a, b, c, d (exposed-case, exposed-noncase, unexposed-case, unexposed-noncase), and these four numbers directly yield the odds ratio, the chi-squared statistic, and, where the study design allows risks to be estimated, the risk ratio. Larger r×c tables and multi-way tables extend the same logic, and stratifying a 2×2 table by a third variable produces the layered tables used in Mantel-Haenszel analysis.
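
The way the four cells yield these quantities can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name and the example counts are hypothetical, the chi-squared value is the uncorrected Pearson statistic (no continuity correction), and the sketch assumes no zero cells or zero margins, so it does not guard against division by zero.

```python
def two_by_two_measures(a, b, c, d):
    """Basic measures from a 2x2 table with cells labelled
    a = exposed case,   b = exposed noncase,
    c = unexposed case, d = unexposed noncase.
    """
    n = a + b + c + d
    risk_exposed = a / (a + b)      # proportion of cases among the exposed
    risk_unexposed = c / (c + d)    # proportion of cases among the unexposed
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)  # cross-product ratio
    # Pearson chi-squared for a 2x2 table via the shortcut formula
    # n * (ad - bc)^2 / (product of the four marginal totals);
    # it equals the sum over cells of (observed - expected)^2 / expected.
    chi_squared = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return {
        "risk_ratio": risk_ratio,
        "odds_ratio": odds_ratio,
        "chi_squared": chi_squared,
    }


# Hypothetical counts, for illustration only.
measures = two_by_two_measures(a=30, b=70, c=10, d=90)
print(measures)
```

For these hypothetical counts the risk ratio is about 3.0, the odds ratio about 3.86, and the chi-squared statistic 12.5 on one degree of freedom. The risk ratio is only interpretable when the row totals reflect the populations at risk, as in a cohort study or trial.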

Clinical relevance

The 2×2 table is the form in which diagnostic accuracy, treatment-effect, and risk-factor data are most often presented, so being able to read one (to identify the cells, the margins, and what is being compared) is basic to appraising health evidence. It is a way of organising and reading data and is not itself a basis for individual diagnostic or treatment decisions.

Epidemiology

Cohort, case-control, and cross-sectional studies, and randomized trials with binary endpoints, all condense at their core to a 2×2 table of an exposure or intervention against an outcome; diagnostic-test studies use a 2×2 of test result against true status. The table is therefore the shared computational starting point across study designs in epidemiology.

History

The term “contingency table” traces to Karl Pearson, who introduced it in 1904, and Fisher's 1922 paper clarified how such tables are analysed, in particular the correct degrees of freedom for the chi-squared test. The fourfold (2×2) table became the workhorse of twentieth-century medical statistics, and reference texts by Fleiss and by Agresti codified its notation and the family of measures and tests built on it.

Key figures

  • Karl Pearson
  • Ronald A. Fisher
  • Joseph Fleiss
  • Alan Agresti

Seminal works

  • fisher-1922
  • fleiss-2003
  • agresti-2013

Frequently asked questions

What is a 2×2 table?
It is the simplest contingency table: two rows and two columns cross-classifying a binary exposure (or intervention) against a binary outcome, giving four cells whose counts are used to compute risk ratios, odds ratios, and chi-squared tests.

What does “independence” mean in a contingency table?
Two variables are independent when the conditional distribution of one is the same at every level of the other. Under independence the expected count in each cell equals its row total times its column total divided by the grand total, and association tests measure departures from this.
