The Law of Large Numbers

Why the sample mean converges to the truth

The law of large numbers states that as sample size grows, the sample mean converges to the true expected value of the population. It explains why larger samples yield more reliable estimates and why casinos and insurers profit predictably over many trials. The law concerns where the average settles; the central limit theorem, by contrast, describes the shape of its fluctuations around that settled value.

Definition and Core Logic

For n independent, identically distributed observations X_1, X_2, ..., X_n the sample mean is defined as X_bar = (X_1 + X_2 + ... + X_n) / n. The law of large numbers states that as n grows toward infinity, X_bar converges to mu, the true expected value of the distribution. In its weak form this convergence is in probability: for every epsilon > 0, P(|X_bar - mu| > epsilon) approaches zero. The strong form guarantees almost-sure convergence: with probability 1, the sequence of sample means converges to mu. For independent, identically distributed observations, both forms require only that the expectation exist and be finite; the variance need not be finite and the distribution need not be normal.
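The two forms can be written compactly as follows. The subscript n on the sample mean and the final Chebyshev bound are additions for illustration; the bound assumes a finite variance sigma^2, which the law itself does not require, and it is included only because it makes the weak form easy to verify in that special case.

```latex
% Sample mean of n i.i.d. observations with E[X_i] = \mu
\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i

% Weak law: convergence in probability
\lim_{n \to \infty} P\left( \left| \bar{X}_n - \mu \right| > \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0

% Strong law: almost-sure convergence
P\left( \lim_{n \to \infty} \bar{X}_n = \mu \right) = 1

% Extra assumption (not needed by the law): \operatorname{Var}(X_i) = \sigma^2 < \infty.
% Then \operatorname{Var}(\bar{X}_n) = \sigma^2 / n, and Chebyshev's inequality gives
P\left( \left| \bar{X}_n - \mu \right| > \varepsilon \right) \le \frac{\sigma^2}{n \varepsilon^2}
```

Because the right-hand side of the last line shrinks to zero as n grows, the weak law follows immediately whenever the variance is finite.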

How to Read and Interpret It

To interpret the law in practice, ask whether your estimate stabilizes as sample size grows. In a dice-rolling simulation the average fluctuates widely over the first few throws but settles close to 3.5 as rolls accumulate; after 1000 rolls its standard error is about 0.054, so it typically lies within roughly 0.1 of 3.5. Shrinking confidence intervals and a standard error that scales as 1 divided by the square root of n (when the variance is finite) are quantitative expressions of the same principle. In research reports the law is usually conveyed indirectly: "larger samples stabilize parameter estimates" and "increasing n reduces estimation variance" are both valid formulations.
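The dice example can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names running_means and standard_error, the seed, and the checkpoints printed are all assumptions chosen for the example. The standard error uses the variance of a fair six-sided die, 35/12.

```python
import math
import random


def running_means(n_rolls, seed=0):
    """Return the running average of a fair six-sided die after each roll.

    Hypothetical helper for illustration; the seed only makes runs repeatable.
    """
    rng = random.Random(seed)
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one roll, uniform on 1..6
        means.append(total / i)     # average of the first i rolls
    return means


def standard_error(n):
    """Theoretical standard error of the mean of n fair-die rolls."""
    sigma = math.sqrt(35 / 12)  # standard deviation of a single fair roll
    return sigma / math.sqrt(n)


if __name__ == "__main__":
    means = running_means(100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n={n:>6}  mean={means[n - 1]:.4f}  se={standard_error(n):.4f}")
```

Printing the running mean next to the theoretical standard error shows the pattern described above: quadrupling n halves the standard error, and the observed average drifts toward 3.5 without ever being guaranteed to equal it.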

Common Misconceptions

The most frequent error is the gambler's fallacy: the belief that, in independent trials, prior losses make future wins more probable. The law of large numbers guarantees that long-run frequencies stabilize at a fixed value, not that individual trials compensate for each other; early deviations are diluted by the growing number of later trials rather than corrected. A second misconception is applying the law to small samples; it is a statement about the limit as n grows and guarantees nothing about any particular small sample. A third error is conflating it with the central limit theorem. The law of large numbers answers where the mean goes; the central limit theorem answers what shape the distribution of that mean takes. These are complementary but distinct results.
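A short simulation makes the point about the gambler's fallacy concrete. The sketch below assumes a fair coin and a streak length of three tails; the function name heads_rate_after_tails_streak and its parameters are hypothetical and chosen only for this example. It compares the share of heads immediately after a streak of tails with the overall share of heads.

```python
import random


def heads_rate_after_tails_streak(n_flips, streak=3, seed=0):
    """Return (heads rate right after `streak` tails in a row, overall heads rate).

    Hypothetical helper for illustration, assuming independent fair coin flips.
    """
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True means heads

    run = 0        # length of the current run of consecutive tails
    followers = 0  # flips that come right after at least `streak` tails
    heads = 0      # how many of those followers are heads
    for is_heads in flips:
        if run >= streak:
            followers += 1
            heads += is_heads
        run = 0 if is_heads else run + 1

    if followers == 0:
        raise ValueError("no tails streak of the requested length occurred")
    return heads / followers, sum(flips) / n_flips


if __name__ == "__main__":
    after_streak, overall = heads_rate_after_tails_streak(200_000)
    print(f"heads after 3 tails: {after_streak:.3f}  overall: {overall:.3f}")
```

Both rates should sit near 0.5: a run of tails does not raise the chance of heads on the next flip, yet the overall frequency still stabilizes.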

Importance in Research and Reporting

This law is foundational to sampling theory, power analysis, and meta-analysis. Insurers set premiums based on expected loss rates; epidemiologists reliably estimate the rates of rare events in large cohorts. As a researcher, report that increasing sample size reduces random estimation error, but note that a larger n does not eliminate systematic bias or systematic measurement error. With a sufficiently large but systematically biased sample, the mean still converges, but to the wrong value. Sample size is therefore a necessary condition for reliable inference, not a sufficient one on its own.
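The claim that a biased sample converges to the wrong value can be illustrated with a small sketch. Everything numeric here is an assumption made up for the example: a normal population with true mean 50 and standard deviation 10, and an instrument that adds a constant offset of 2 to every reading. The function name compare_means is likewise hypothetical.

```python
import random
import statistics


def compare_means(n, true_mean=50.0, sd=10.0, bias=2.0, seed=0):
    """Return (mean of clean readings, mean of systematically biased readings).

    Hypothetical setup for illustration: every biased reading is the clean
    reading plus the same constant offset `bias`.
    """
    rng = random.Random(seed)
    clean = [rng.gauss(true_mean, sd) for _ in range(n)]
    biased = [x + bias for x in clean]  # systematic error, identical for all readings
    return statistics.fmean(clean), statistics.fmean(biased)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        clean_mean, biased_mean = compare_means(n)
        print(f"n={n:>9}  clean={clean_mean:.3f}  biased={biased_mean:.3f}")
```

As n grows, the clean mean settles near 50 while the biased mean settles just as firmly near 52. More data narrows the random scatter around each value but leaves the gap between them untouched.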

Sources

  1. Feller, W. (1968). An Introduction to Probability Theory and Its Applications, Vol. 1 (3rd ed.). John Wiley & Sons. ISBN: 978-0-471-25708-0