ScholarGate

Sampling Distributions and Central Limit Theorem

A sampling distribution is the probability distribution of a statistic, such as a sample mean, across all possible samples of a given size. The central limit theorem states that, for large enough samples of independent observations from a population with finite variance, the sampling distribution of the mean is approximately normal regardless of the shape of the underlying data. Together, these two ideas explain why normal-based confidence intervals and tests apply so widely.

Definition

A sampling distribution is the distribution of values a statistic would take over all possible samples of a fixed size from a population. The central limit theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, whatever the shape of the population, provided the observations are independent and the population variance is finite.

Scope

The entry covers the concept of a sampling distribution, the standard error as its spread, the central limit theorem and the role of sample size, and the distinction between the standard deviation of individuals and the standard error of a statistic. It links these ideas to confidence intervals and hypothesis testing. It is a methodological reference and not clinical guidance.

Core questions

  • What is the sampling distribution of a statistic and why does it matter?
  • How does the standard error differ from the standard deviation?
  • What does the central limit theorem guarantee, and under what conditions?
  • How does sample size affect the precision of an estimate?

Key concepts

  • Statistic versus parameter
  • Sampling distribution
  • Standard error
  • Standard error versus standard deviation
  • Sample size and precision
  • Approximate normality of the mean
  • Basis of confidence intervals and tests

Key theories

Central limit theorem
For independent observations from a population with finite variance, the distribution of the sample mean tends toward a normal distribution as the sample size grows, irrespective of the population's shape; this justifies normal-based inference for means even when individual measurements are non-normal.
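
In symbols, the classical (Lindeberg-Lévy) form of the statement above can be written as follows. The notation is an assumption introduced here for illustration: X_1, ..., X_n are independent draws from the same population with mean \mu and finite variance \sigma^2 > 0, and \bar{X}_n is their sample mean.

```latex
\[
  \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i ,
  \qquad
  \sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma}
  \;\xrightarrow{\;d\;}\; \mathcal{N}(0,1)
  \quad \text{as } n \to \infty .
\]
% Informal reading for large n:
\[
  \bar{X}_n \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^{2}}{n}\right),
  \qquad
  \operatorname{SE}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}} .
\]
```

The second line is only an approximation for finite n; how good it is depends on the shape of the population.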

Mechanisms

If repeated samples of the same size were drawn from a population, a statistic such as the mean would vary from sample to sample; the distribution of those values is the sampling distribution, and its standard deviation is the standard error. For a sample mean, the standard error equals the population standard deviation divided by the square root of the sample size (σ/√n), so precision improves as samples grow, but only in proportion to the square root of n: halving the standard error requires quadrupling the sample size. The central limit theorem adds that, for sufficiently large samples, this sampling distribution is approximately normal even when the data themselves are skewed, provided the observations are independent and the variance is finite. This is the engine of classical inference: a confidence interval for a mean is built by stepping out a number of standard errors from the estimate under approximate normality, and many hypothesis tests compare an estimate to its sampling distribution. The standard error, which shrinks with sample size, must be distinguished from the standard deviation of the individual observations, which estimates population spread and does not shrink as the sample grows.
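
The mechanism described above can be checked with a small simulation. The sketch below is illustrative only and uses the Python standard library. The exponential population (rate 1, so mean 1 and standard deviation 1), the sample sizes, the number of repetitions, and the helper name simulate are all assumptions chosen for the example, not part of the entry.

```python
import math
import random
import statistics

def simulate(n, reps, rng):
    """Draw `reps` samples of size n from an Exponential(rate=1) population.

    Returns two lists: the sample means and the sample standard deviations.
    """
    means, sds = [], []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        sds.append(statistics.stdev(sample))
    return means, sds

rng = random.Random(12345)   # fixed seed so reruns give the same numbers
sigma = 1.0                  # population SD of Exponential(rate=1)

results = {}
for n in (10, 40, 160):
    means, sds = simulate(n, reps=2000, rng=rng)
    empirical_se = statistics.stdev(means)   # spread of the statistic across samples
    theoretical_se = sigma / math.sqrt(n)    # sigma / sqrt(n)
    typical_sd = statistics.fmean(sds)       # average spread of individual observations
    results[n] = (empirical_se, theoretical_se, typical_sd)
    print(f"n={n:4d}  empirical SE={empirical_se:.3f}  "
          f"sigma/sqrt(n)={theoretical_se:.3f}  average sample SD={typical_sd:.3f}")
```

Each fourfold increase in n should roughly halve the standard error, while the average sample standard deviation stays close to the population value (it tends to sit slightly below it for small n, because the sample standard deviation is a biased estimator).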

Clinical relevance

Confidence intervals and p-values reported in clinical and public-health studies rest on the sampling distribution of the estimate and on the central limit theorem, so understanding both helps readers judge the precision of reported effects. This entry is methodological background and not a basis for individual clinical decisions.

History

Early forms of the central limit theorem appeared in de Moivre's normal approximation to the binomial (1733) and in Laplace's work around 1810, and rigorous general conditions were established by Lyapunov and others around 1900. The sampling-distribution viewpoint became central to inference in the early twentieth century and remains the standard justification for normal-based confidence intervals and tests in biostatistics.

Debates

How large must a sample be for the central limit theorem to apply?
The approximation improves with sample size, but how large is large enough depends on how skewed the data are; for markedly skewed distributions much larger samples are needed before the mean's distribution is acceptably normal, so no single rule of thumb fits all cases.
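
One way to see why no single threshold works is to track how skewed the sampling distribution of the mean remains at a given n. The sketch below is a rough illustration using only the standard library. The two populations (an exponential and a lognormal with log-scale SD 1.5), the sample sizes, and the helper names skewness and mean_skewness are assumptions made for the example.

```python
import random
import statistics

def skewness(values):
    """Moment-based sample skewness: the mean of the cubed z-scores."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def mean_skewness(draw, n, reps, rng):
    """Skewness of the simulated sampling distribution of the mean."""
    means = [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]
    return skewness(means)

rng = random.Random(2024)  # fixed seed for repeatability
populations = {
    "exponential": lambda r: r.expovariate(1.0),          # moderately skewed
    "lognormal":   lambda r: r.lognormvariate(0.0, 1.5),  # heavily skewed
}

skew = {}
for name, draw in populations.items():
    for n in (5, 30, 200):
        skew[(name, n)] = mean_skewness(draw, n, reps=3000, rng=rng)
        print(f"{name:12s} n={n:4d}  skewness of sample means = {skew[(name, n)]:.2f}")
```

A skewness near zero is consistent with approximate normality. In this setup the exponential case should approach zero fairly quickly, while the lognormal case should remain visibly skewed even at n = 200, which is the point made above: the required sample size depends on the shape of the data.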

Key figures

  • Pierre-Simon Laplace
  • Abraham de Moivre
  • Aleksandr Lyapunov

Seminal works

  • altman-bland-2005-se
  • rosner-2015

Frequently asked questions

What is the difference between a standard deviation and a standard error?
A standard deviation measures the spread of individual observations, whereas a standard error measures the spread of a statistic, such as a sample mean, across samples; the standard error decreases as the sample size grows, while the standard deviation estimates a fixed population quantity.

Why can we use the normal distribution for a mean even when data are skewed?
The central limit theorem says that, for independent observations with finite variance, the sampling distribution of the mean becomes approximately normal as the sample size increases regardless of the data's shape, so normal-based methods for the mean are often valid with large enough samples even when individual values are not normally distributed.
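
The second answer can be illustrated by checking how often a normal-based 95% interval for the mean actually covers the true mean when the data are skewed. This is a minimal sketch under stated assumptions: the population is exponential with true mean 1, the interval is the estimate plus or minus 1.96 estimated standard errors (with the sample standard deviation standing in for the unknown population value), and the helper name ci_covers is hypothetical.

```python
import math
import random
import statistics

def ci_covers(sample, true_mean, z=1.96):
    """Return True if mean +/- z * (s / sqrt(n)) contains true_mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error
    return mean - z * se <= true_mean <= mean + z * se

rng = random.Random(7)   # fixed seed for repeatability
true_mean = 1.0          # mean of Exponential(rate=1)
reps = 2000

coverage = {}
for n in (10, 100):
    hits = sum(
        ci_covers([rng.expovariate(1.0) for _ in range(n)], true_mean)
        for _ in range(reps)
    )
    coverage[n] = hits / reps
    print(f"n={n:4d}  share of intervals covering the true mean = {coverage[n]:.3f}")
```

With skewed data the small-sample intervals should cover the true mean noticeably less often than the nominal 95%, and coverage should move closer to 95% at the larger sample size. For small n a t-quantile is customarily used in place of 1.96; the fixed value here keeps the example focused on the normal approximation itself.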
