How to Conduct a Hypothesis Test
The step-by-step testing procedure
Hypothesis testing is a six-step decision process by which a researcher systematically evaluates a claim against data. The steps are: state the hypotheses, choose a significance level and appropriate test, verify the test assumptions, compute the test statistic and p-value, decide whether to reject the null hypothesis, and interpret the result in context together with an effect size and confidence interval. Reporting the p-value alone is never sufficient.
The Concept and Its Logic
Hypothesis testing is a core framework of statistical inference. The null hypothesis (H0) asserts that there is no effect or difference; the alternative hypothesis (H1) represents the claim the researcher wishes to evaluate. The underlying logic is indirect proof: we calculate how probable it would be to obtain data at least as extreme as the data actually observed if H0 were true. That probability is the p-value. When the p-value falls below a pre-specified significance threshold (α, commonly 0.05), H0 is rejected. The test statistic reduces the sample size, standard deviation, and observed difference to a single number. For example, a one-sample t-test uses t = (x̄ - μ0) / (s / √n), where x̄ is the sample mean, μ0 the mean specified under H0, s the sample standard deviation, and n the sample size.
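To make the formula concrete, here is a minimal sketch of a one-sample t-test using only the Python standard library. The sample values and μ0 = 5.0 are hypothetical, and the helper names (t_pdf, two_sided_p, one_sample_t) are invented for this illustration. The p-value is obtained by numerically integrating the t density, which is adequate for moderate values of t; in practice a statistics library would normally do this step.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|) under H0, using Simpson's rule on [0, |t|].

    steps must be even. This is an illustrative approximation,
    not a replacement for a library routine.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_mass)


def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom, two-sided p) for H0: mean == mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return t, n - 1, two_sided_p(t, n - 1)


# Hypothetical measurements, tested against a hypothetical mu0 of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.2]
t_stat, df, p_value = one_sample_t(sample, mu0=5.0)
print(f"t({df}) = {t_stat:.2f}, p = {p_value:.3f}")
```

Note that stdev computes the sample standard deviation (dividing by n - 1), which is the s in the formula.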
Applying the Six Steps
Step 1: Write the research question explicitly as H0 and H1.
Step 2: Choose α and the appropriate test for your data structure, for example an independent-samples t-test for comparing the means of two unrelated groups, or a chi-square test for categorical counts and proportions.
Step 3: Check assumptions such as normality, independence, and homogeneity of variance using plots and diagnostic tests.
Step 4: Compute the test statistic, then obtain the p-value from software or compare the statistic to a critical value.
Step 5: If p < α, reject H0; otherwise fail to reject it.
Step 6: Interpret the result in domain terms and report an effect size (Cohen's d, η², etc.) alongside a confidence interval.
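The six steps can be sketched in code for a simple case: comparing two proportions. The example below uses a pooled two-proportion z-test, which for a 2x2 table is equivalent to the chi-square test without continuity correction. The counts (45 of 100 versus 30 of 100), the function name, the "expected count of at least 5" rule of thumb for Step 3, and the choice of the risk difference with a Wald interval as the effect size are all assumptions made for illustration, not the only reasonable choices.

```python
import math
from statistics import NormalDist


def two_proportion_test(x1, n1, x2, n2, alpha=0.05):
    """Walk through the six steps for H0: p1 == p2 vs H1: p1 != p2."""
    # Step 1: hypotheses are fixed by the docstring (two-sided).
    # Step 2: alpha is an argument; the test is a pooled z-test.
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        raise ValueError("no variation in outcomes; the test is undefined")

    # Step 3: rough large-sample check (rule of thumb, an assumption):
    # expected successes and failures in each group should be >= 5.
    expected = [n1 * pooled, n1 * (1 - pooled),
                n2 * pooled, n2 * (1 - pooled)]
    assumptions_ok = min(expected) >= 5

    # Step 4: test statistic and two-sided p-value.
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Step 5: decision.
    reject = p_value < alpha

    # Step 6: effect size (risk difference) with a Wald confidence interval.
    diff = p1 - p2
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

    return {"z": z, "p_value": p_value, "reject_h0": reject,
            "assumptions_ok": assumptions_ok, "diff": diff, "ci": ci}


# Hypothetical counts: 45/100 successes in group 1, 30/100 in group 2.
result = two_proportion_test(45, 100, 30, 100)
print(result)
```

Independence of observations, the other assumption named in Step 3, cannot be checked from the counts alone; it has to be justified by the study design.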
Common Misconceptions
The most frequent mistake is treating the p-value as a measure of effect size or practical importance; it only measures how incompatible the data are with H0. A second misconception is concluding that p > 0.05 proves H0 is true; it merely means that insufficient evidence was found to reject it. A third error is equating statistical significance with practical significance: in large samples, trivially small differences can yield very low p-values. Finally, skipping assumption checks can invalidate the entire procedure, since the p-values from tests like the t-test become unreliable when their underlying assumptions are seriously violated.
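The third point is easy to see numerically. The sketch below holds a tiny standardized difference fixed (d = 0.02, a hypothetical value) and shows how the two-sided p-value changes as the per-group sample size grows. It uses a two-group z-test with known unit variance as a simplifying assumption, so that z = d · √(n/2); the function names are made up for the example.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided normal p-value; erfc keeps precision for large |z|."""
    return math.erfc(abs(z) / math.sqrt(2))


def p_for_fixed_effect(d, n_per_group):
    """p-value for a standardized mean difference d with equal group sizes.

    Assumes known unit variance in both groups (a simplification),
    so the standard error of the difference is sqrt(2 / n).
    """
    z = d * math.sqrt(n_per_group / 2)
    return two_sided_p_from_z(z)


d = 0.02  # hypothetical, practically negligible effect
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {p_for_fixed_effect(d, n):.3g}")
```

The effect is identical in every row; only the sample size changes, which is why an effect size has to be reported next to the p-value.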
Why It Matters and How to Report It
Hypothesis testing helps keep scientific decisions from resting on chance variation, and it is a powerful tool when applied correctly. APA guidelines and many journal policies now ask for more than a bare p-value. A complete report should include: the test type and degrees of freedom, the test statistic and exact p-value, the effect size, and a confidence interval. A model report: t(48) = 2.34, p = .023, d = 0.66, 95% CI [0.10, 1.22]. Reporting sample size and statistical power further strengthens transparency and allows readers to independently assess the practical meaning and replicability of the finding.
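A small helper can keep such reports consistent. The sketch below formats the elements listed above in the same style as the model report, and shows how Cohen's d relates to t for an independent-samples test with a pooled standard deviation, d = t · √(1/n1 + 1/n2). The function names are hypothetical, and the group sizes of 25 and 25 are an assumption: the model report gives only df = 48, which is consistent with but does not imply equal groups.

```python
import math


def cohens_d_from_t(t, n1, n2):
    """Cohen's d implied by a pooled independent-samples t statistic."""
    return t * math.sqrt(1 / n1 + 1 / n2)


def format_p(p):
    """Exact p to three decimals without a leading zero; floor at .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_t_report(t, df, p, d, ci_low, ci_high, level=95):
    """Assemble test statistic, p-value, effect size, and CI in one line."""
    return (f"t({df}) = {t:.2f}, {format_p(p)}, d = {d:.2f}, "
            f"{level}% CI [{ci_low:.2f}, {ci_high:.2f}]")


# Values taken from the model report; n1 = n2 = 25 is assumed.
d = cohens_d_from_t(2.34, 25, 25)
report = format_t_report(2.34, 48, 0.023, d, 0.10, 1.22)
print(report)
```

Under the equal-groups assumption the computed d rounds to the 0.66 quoted in the model report. The confidence interval is passed in rather than computed, because several methods for intervals around d exist and the text does not say which one was used.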