ScholarGate

Probability and Probability Distributions

Probability is the mathematical language for quantifying uncertainty, and probability distributions describe how the possible values of a random variable are spread out. Together they form the theoretical foundation on which statistical inference in the health sciences is built: every confidence interval, p-value, and risk estimate ultimately rests on a probability model of how data could have arisen.

Definition

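Probability assigns a number between 0 and 1 to each event to express how likely it is, with 0 for an impossible event and 1 for a certain one. A probability distribution is a function that specifies the probabilities of the possible values of a random variable: value by value for a discrete variable, or through a density that gives the probability of each range of values for a continuous variable.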

Scope

This area orients the reader to the core ideas of probability and to the distributions most used in biostatistics. It covers the basic rules of probability, conditional probability and independence, the normal distribution, the binomial and Poisson distributions for counts and events, and the sampling distributions that link a sample to the population through the central limit theorem. It is a reference-educational overview of methodology, not clinical guidance.

Core questions

  • How is uncertainty quantified so that data can be reasoned about formally?
  • What distribution describes a given type of measurement or count?
  • How does the behaviour of a sample statistic relate to the underlying population?
  • Why does the normal distribution arise so often in aggregate quantities?

Key concepts

  • Random variable
  • Sample space and events
  • Probability axioms
  • Conditional probability and independence
  • Discrete and continuous distributions
  • Expectation and variance
  • Sampling distribution
  • Central limit theorem
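
Several of these concepts can be made concrete by enumerating a small sample space. The following is a minimal Python sketch rather than a definitive treatment. It assumes two fair six-sided dice purely as an illustration, and the helper names (`prob`, `conditional`, `first_is_even`, `sum_is_seven`, `sum_is_eight`) are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely ordered outcomes of two fair dice
# (an illustrative assumption, not taken from the entry).
SAMPLE_SPACE = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    favourable = sum(1 for outcome in SAMPLE_SPACE if event(outcome))
    return Fraction(favourable, len(SAMPLE_SPACE))


def conditional(event_a, event_b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda o: event_a(o) and event_b(o)) / p_b


def first_is_even(o):
    return o[0] % 2 == 0


def sum_is_seven(o):
    return o[0] + o[1] == 7


def sum_is_eight(o):
    return o[0] + o[1] == 8


# Random variable X = total shown by the two dice.
totals = [a + b for a, b in SAMPLE_SPACE]
mean_total = Fraction(sum(totals), len(totals))                       # expectation E[X]
var_total = sum((x - mean_total) ** 2 for x in totals) / len(totals)  # variance Var(X)

print(prob(sum_is_seven), conditional(sum_is_seven, first_is_even))  # 1/6 1/6
print(prob(sum_is_eight), conditional(sum_is_eight, first_is_even))  # 5/36 1/6
print(mean_total, var_total)                                         # 7 35/6
```

In this example a total of seven is independent of the first die being even, because conditioning leaves its probability unchanged at 1/6, whereas a total of eight is not independent of it: its probability moves from 5/36 to 1/6.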

Mechanisms

A probability model specifies a sample space of possible outcomes and assigns probabilities consistent with the axioms: non-negativity, a total probability of one, and additivity for mutually exclusive events. Random variables map outcomes to numbers, and their distributions summarise the probabilities of those numbers, characterised by quantities such as the mean (expectation) and variance. Discrete distributions such as the binomial and Poisson model counts of events: the binomial counts events in a fixed number of independent trials, and the Poisson counts events occurring at a constant average rate over an interval of time or space. The continuous normal distribution models many measured quantities and, via the central limit theorem, approximates the distribution of sums and averages of many independent observations with finite variance, even when the observations themselves are not normally distributed. Inferential statistics works by treating an observed statistic as a draw from its sampling distribution.
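
The idea of a sampling distribution, and the way the central limit theorem shapes it, can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not a prescribed method. It assumes an exponential population with rate 1 (mean 1, variance 1) as an example of a clearly skewed distribution, a sample size of 50, and 2000 replicates. The function name `simulate_sample_means` and the seed are hypothetical choices.

```python
import math
import random
import statistics


def simulate_sample_means(sample_size, replicates, seed=0):
    """Return `replicates` sample means, each from `sample_size` exponential draws.

    The exponential distribution with rate 1 (mean 1, variance 1) is an
    illustrative choice of a non-normal, right-skewed population.
    """
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(replicates)
    ]


sample_size = 50
means = simulate_sample_means(sample_size, replicates=2000)

# Theory: the sampling distribution of the mean is centred on the population
# mean (1) with standard error sigma / sqrt(n) = 1 / sqrt(50).
standard_error = 1.0 / math.sqrt(sample_size)

# Share of simulated means within 1.96 standard errors of the population mean.
within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in means) / len(means)

print(f"mean of sample means: {statistics.mean(means):.3f}")
print(f"sd of sample means:   {statistics.stdev(means):.3f} (theory {standard_error:.3f})")
print(f"share within 1.96 SE: {within:.3f} (normal theory 0.95)")
```

Although individual exponential observations are strongly skewed, the simulated means should cluster around 1 with a spread close to the theoretical standard error, and roughly 95% of them should fall within 1.96 standard errors, as a normal approximation would suggest.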

Clinical relevance

Probability distributions underpin the statistical methods used to summarise health data and to draw inferences from studies, so understanding them supports critical reading of the quantitative literature. This entry describes the methodological foundation of such analyses and is not a basis for individual diagnostic or treatment decisions.

History

Mathematical probability grew from seventeenth-century analyses of games of chance and was developed by Bernoulli, Laplace, Gauss, and Poisson into a general theory of distributions. Kolmogorov's axiomatic formulation in the 1930s placed probability on a rigorous footing. Through the twentieth century these tools became the basis of statistical inference, and biostatistics adopted them to model measurements and counts in medical and public-health research.

Key figures

  • Pierre-Simon Laplace
  • Carl Friedrich Gauss
  • Siméon Denis Poisson
  • Jacob Bernoulli
  • Andrey Kolmogorov

Seminal works

  • altman-bland-1995-normal
  • rosner-2015
  • ross-2014

Frequently asked questions

Why do biostatistics courses spend so much time on probability distributions?
Because statistical inference works by comparing observed data to what a probability model predicts; the distribution is the bridge between a sample and a statement about the population, so the validity of confidence intervals and tests depends on choosing an appropriate distribution.

What is the difference between a probability and a probability distribution?
A probability is a single number describing how likely one event is, whereas a probability distribution specifies the probabilities across all possible values of a random variable at once.
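
The distinction can be seen in a short computation. The sketch below is illustrative only. It assumes a binomial random variable with hypothetical parameters n = 10 and p = 0.2, and the helper names `binomial_pmf` and `poisson_pmf` are not taken from any particular library.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): k events in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate): k events when `rate` are expected."""
    return math.exp(-rate) * rate**k / math.factorial(k)


n, p = 10, 0.2  # hypothetical values chosen only for illustration

# A probability: one number for one event.
p_exactly_two = binomial_pmf(2, n, p)

# A probability distribution: probabilities for every possible value at once.
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
mean = sum(k * pk for k, pk in distribution.items())
variance = sum((k - mean) ** 2 * pk for k, pk in distribution.items())

print(f"P(X = 2) = {p_exactly_two:.4f}")
print(f"total = {sum(distribution.values()):.4f}, mean = {mean:.2f}, variance = {variance:.2f}")
print(f"Poisson with rate 2: P(X = 2) = {poisson_pmf(2, 2.0):.4f}")
```

The single value `p_exactly_two` is a probability, while `distribution` holds the whole distribution: its entries sum to one, and its mean and variance agree with the binomial formulas np and np(1 - p).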
