Sampling Methods

Probability and Non-Probability Sampling

Sampling is the process of selecting a subset of a population to collect data and draw conclusions about the whole. Probability sampling methods — simple random, systematic, stratified, cluster, and multistage — assign every unit a known, nonzero selection probability, enabling unbiased estimation and statistical generalization. Non-probability methods — convenience, purposive, quota, and snowball sampling — are cheaper and sometimes necessary, but are prone to selection bias and limit the validity of statistical inference.

Core Concepts: Sampling Frame and Representativeness

At the heart of any sampling process lie two concepts: the sampling frame and representativeness. The sampling frame is the list or database from which the sample is actually drawn; ideally it enumerates every unit in the target population, and if it is incomplete or outdated, coverage error results. Representativeness refers to how well the chosen sample mirrors the characteristics of the target population. Increasing the sample size reduces the standard error of the sample mean (SE = σ/√n under simple random sampling, where σ is the population standard deviation and n is the sample size), but a larger sample alone does not guarantee representativeness: sampling error shrinks with n, whereas bias from a flawed frame or selection method does not. The choice of sampling method is therefore at least as important as sample size.
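As a small numerical illustration of the standard-error formula, the sketch below evaluates SE = σ/√n for a few sample sizes. The function name standard_error and the value σ = 10 are assumptions chosen for illustration only.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean under simple random sampling: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sigma / math.sqrt(n)


# Assumed population standard deviation (illustrative value, not from any dataset).
sigma = 10.0

# Each fourfold increase in n halves the standard error.
for n in (100, 400, 1600):
    print(f"n = {n:5d}  SE = {standard_error(sigma, n):.3f}")
```

With these assumed values the standard error falls from 1.0 to 0.5 to 0.25. The calculation says nothing about bias: a sample drawn from a flawed frame keeps its bias however large n becomes.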

Probability Sampling: Types and How They Work

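In probability sampling, every unit in the population has a known, nonzero chance of being selected. Simple random sampling gives every unit, and every possible sample of size n, the same chance of selection. Systematic sampling picks a random starting point within the first k units of an ordered list and then takes every k-th unit, where k = N/n, N is the population size, and n is the sample size. Stratified sampling divides the population into internally homogeneous subgroups (strata) and draws a separate random sample from each stratum, which improves precision and guarantees coverage of subgroups of interest. Cluster sampling divides the population into naturally occurring groups (clusters), such as schools or households, randomly selects some of the clusters, and includes every unit in the selected clusters. Multistage sampling combines these techniques in successive stages, for example by selecting clusters first and then sampling units within them, and is common in large-scale surveys.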
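The four basic designs can be sketched with Python's standard random module. This is a minimal illustration, not a survey-grade implementation: the function names, the toy population of 1,000 integer unit IDs, and the rules that assign units to strata and clusters are all assumptions made for the example.

```python
import random
from collections import defaultdict


def simple_random_sample(units, n, rng):
    """Draw n units without replacement; every unit is equally likely."""
    return rng.sample(units, n)


def systematic_sample(units, n, rng):
    """Random start within the first k units, then every k-th unit (k = N // n)."""
    k = len(units) // n
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = rng.randrange(k)
    # Truncate in case N is not an exact multiple of n.
    return units[start::k][:n]


def stratified_sample(units, stratum_of, fraction, rng):
    """Proportional allocation: sample the same fraction from every stratum."""
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)
    sample = []
    for members in strata.values():
        n_h = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n_h))
    return sample


def cluster_sample(units, cluster_of, n_clusters, rng):
    """Randomly select whole clusters and keep every unit inside them."""
    clusters = defaultdict(list)
    for u in units:
        clusters[cluster_of(u)].append(u)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for c in chosen for u in clusters[c]]


# Toy population: unit IDs 0..999 (hypothetical).
population = list(range(1000))
rng = random.Random(42)  # fixed seed so the sketch is reproducible

srs = simple_random_sample(population, 50, rng)
sys_s = systematic_sample(population, 50, rng)
# Hypothetical strata: four groups defined by ID modulo 4.
strat = stratified_sample(population, lambda u: u % 4, 0.10, rng)
# Hypothetical clusters: 20 blocks of 50 consecutive IDs.
clus = cluster_sample(population, lambda u: u // 50, 3, rng)

print(len(srs), len(sys_s), len(strat), len(clus))
```

Under these assumptions the systematic sample has an interval of k = 20, the stratified sample takes 25 units from each of the four strata, and the cluster sample contains all 150 units from three selected blocks. A multistage design could be sketched by calling simple_random_sample on the units inside each selected cluster instead of keeping them all.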

Non-Probability Sampling: Uses and Limitations

In non-probability sampling, selection probabilities are unknown. Convenience sampling takes the most accessible individuals; purposive (judgmental) sampling relies on the researcher's judgment to select information-rich cases; quota sampling fills predetermined targets for each category; snowball sampling grows through participant referrals and is useful for hard-to-reach populations. These methods are cost-effective and sometimes the only option, but they carry a high risk of selection bias. Design-based statistical inference (hypothesis tests, confidence intervals) is not strictly justified for non-probability samples, because it relies on known selection probabilities, so findings should not be generalized to the broader population without additional, explicitly stated assumptions.

Sampling in Research Practice: Design Effects and Common Mistakes

Sampling design directly affects standard errors. Stratified sampling typically yields smaller standard errors than simple random sampling of the same size, whereas cluster sampling introduces a design effect, DEFF = 1 + ρ(m−1), where ρ is the intraclass correlation and m is the cluster size. DEFF is the factor by which the variance is inflated relative to a simple random sample of the same size, so standard errors grow by a factor of √DEFF. A common misconception is that large convenience samples confer the advantages of probability sampling. They do not. Another frequent mistake is to ignore low survey response rates: if non-respondents differ systematically from respondents, nonresponse bias results. Rigorous research requires transparent reporting of the sampling method, the frame used, and any resulting limitations.
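The design effect formula can be worked through numerically. The sketch below assumes equal-sized clusters; the function names and the values ρ = 0.05, m = 20, and n = 1000 are illustrative assumptions, not figures from any particular survey.

```python
import math


def design_effect(rho: float, m: float) -> float:
    """DEFF = 1 + rho * (m - 1) for clusters of equal size m."""
    return 1.0 + rho * (m - 1)


def effective_sample_size(n: int, rho: float, m: float) -> float:
    """Size of the simple random sample that would give the same variance."""
    return n / design_effect(rho, m)


# Assumed values for illustration only.
rho, m, n = 0.05, 20, 1000

deff = design_effect(rho, m)
print(f"DEFF = {deff:.2f}")
print(f"effective n = {effective_sample_size(n, rho, m):.0f}")
# DEFF inflates the variance, so the standard error grows by its square root.
print(f"SE inflation = {math.sqrt(deff):.2f}")
```

With these assumed values DEFF is about 1.95, so 1,000 clustered observations carry roughly the information of 513 independent ones, and standard errors are about 1.40 times larger than under simple random sampling. Even a modest intraclass correlation is costly when clusters are large.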

Key Thinkers

William G. Cochran (1909–1980): Scottish-American statistician whose landmark textbook 'Sampling Techniques' systematized modern sampling theory and remains the field's primary foundational reference.

Sources

Cochran, W. G. (1977). Sampling Techniques (3rd ed.). John Wiley & Sons. ISBN: 978-0-471-16240-7