Educational Assessment and Learning Outcomes

Educational assessment is the process of gathering and interpreting evidence about what learners know and can do, against defined learning outcomes. It distinguishes assessment that supports learning (formative) from assessment that certifies achievement (summative), and it is judged by qualities such as validity, reliability, and educational impact.

Definition

Educational assessment is the systematic collection and interpretation of evidence of learning against intended outcomes. It is used either to support further learning (formative) or to make decisions about achievement and progression (summative). Learning outcomes are the statements of what learners should be able to do, and they are what assessment is designed to measure.

Scope

This topic covers the purposes and qualities of assessment in health professions education, frameworks for what to assess, the contrast between formative and summative assessment, and the related idea of programme evaluation. It treats assessment as a methodological topic and is not a guide for grading or examining any specific course.

Core questions

What is the purpose of a given assessment - to support learning or to certify it?
Which level of competence does an assessment target?
What makes an assessment valid, reliable, and defensible?
How do individual assessments combine into a coherent programme?

Key concepts

Formative and summative assessment
Validity and reliability
Learning outcomes and objectives
Miller's pyramid of competence
Workplace-based assessment
Programmatic assessment
Programme evaluation

Key theories

Miller's pyramid: A framework describing four ascending levels of clinical competence - knows, knows how, shows how, and does - used to match assessment methods to the level of competence being judged.
Programmatic assessment: An approach that treats individual assessments as data points combined deliberately over time, optimising the whole programme for both learning and decision-making rather than relying on isolated high-stakes tests.
Utility of assessment: The view that the value of an assessment is a product of several qualities - validity, reliability, educational impact, acceptability, and cost - that must be balanced rather than maximised individually.
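
The multiplicative reading of utility can be illustrated with a minimal sketch. The 0-1 rating scale, the function name `utility`, and the example numbers are assumptions made for illustration and do not come from the cited frameworks. The sketch only shows why a product rewards balance: one very weak quality drags the whole value down even when the average rating is high.

```python
from math import prod

# The five qualities named in the utility concept above.
# "cost" is rated here as affordability (1.0 = very affordable); that
# convention, like the 0-1 scale itself, is an assumption of this sketch.
QUALITIES = ("validity", "reliability", "educational_impact", "acceptability", "cost")


def utility(ratings: dict[str, float]) -> float:
    """Return the product of the five quality ratings, each in [0, 1]."""
    for quality in QUALITIES:
        if quality not in ratings:
            raise ValueError(f"missing rating for {quality!r}")
        if not 0.0 <= ratings[quality] <= 1.0:
            raise ValueError(f"rating for {quality!r} must lie in [0, 1]")
    return prod(ratings[quality] for quality in QUALITIES)


# Hypothetical assessments: one moderately good on every quality,
# one excellent on four qualities but poor on reliability.
balanced = dict.fromkeys(QUALITIES, 0.8)
lopsided = {**dict.fromkeys(QUALITIES, 1.0), "reliability": 0.2}

print(utility(balanced) > utility(lopsided))
```

Under these assumed numbers the lopsided assessment has the higher average rating but the lower product, which is the sense in which the qualities must be balanced rather than maximised individually.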

Mechanisms

Assessment is designed by matching the method to the purpose and to the level of competence being judged. Miller's pyramid (Miller, 1990) orders competence from knowledge (knows, knows how) to performance (shows how, does), so that, for example, written tests suit the lower levels and workplace observation suits the higher ones. The chosen methods are then appraised for utility - validity, reliability, impact on learning, acceptability, and cost. In programmatic approaches they are combined into a deliberate sequence of low- and high-stakes data points that together support both learning and robust decisions (Epstein, 2007; Van der Vleuten et al., 2012). Programme evaluation extends the same logic to judging the educational programme itself (Frye & Hemmer, 2012).
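
The matching and combining steps described above can be sketched in code. This is an illustrative sketch under stated assumptions, not a published algorithm: the names `MillerLevel`, `suggest_method`, `DataPoint`, and `ready_for_decision` are hypothetical, and the decision rule (a minimum number of data points plus at least one observation of performance) is invented here to show the idea of pooling evidence before a high-stakes judgement.

```python
from dataclasses import dataclass
from enum import IntEnum


class MillerLevel(IntEnum):
    """The four levels of Miller's pyramid, lowest to highest."""
    KNOWS = 1
    KNOWS_HOW = 2
    SHOWS_HOW = 3
    DOES = 4


def suggest_method(level: MillerLevel) -> str:
    """Pair a level with a method family, following the text:
    written tests for lower levels, workplace observation for higher ones."""
    if level <= MillerLevel.KNOWS_HOW:
        return "written test"
    return "workplace observation"


@dataclass(frozen=True)
class DataPoint:
    """One assessment treated as a data point in a programme."""
    level: MillerLevel
    stakes: str  # "low" or "high"
    method: str


def ready_for_decision(points: list[DataPoint], min_points: int = 4) -> bool:
    """Hypothetical rule: decide only when enough data points exist
    and at least one of them observes performance (shows how or does)."""
    observed = any(p.level >= MillerLevel.SHOWS_HOW for p in points)
    return len(points) >= min_points and observed


# A hypothetical sequence of low- and high-stakes data points.
portfolio = [
    DataPoint(level, stakes, suggest_method(level))
    for level, stakes in [
        (MillerLevel.KNOWS, "low"),
        (MillerLevel.KNOWS_HOW, "low"),
        (MillerLevel.SHOWS_HOW, "low"),
        (MillerLevel.DOES, "high"),
    ]
]

print(ready_for_decision(portfolio))
```

A real programme would set its own thresholds and would weigh the quality of each data point as well as the count; the sketch only makes the structure of the reasoning explicit.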

Clinical relevance

Assessment shapes what learners study and how educators judge competence, so understanding its principles supports the design and critique of fair, defensible evaluation in health professions education. The topic describes how learning is measured and is not a basis for individual clinical decisions.

Evidence & guidelines

Assessment practice in the health professions is guided by widely cited frameworks - Miller's pyramid for matching methods to competence (Miller, 1990), the utility concept and reviews of assessment methods (Epstein, 2007), and programmatic assessment for combining evidence over time (Van der Vleuten et al., 2012). Programme evaluation draws on established models such as those summarised by Frye and Hemmer (2012). Much of this evidence is conceptual and consensus-based rather than experimental.

History

Assessment in the health professions shifted over the late twentieth century from a focus on knowledge testing toward the direct observation of performance, crystallised by Miller's 1990 pyramid. Subsequent decades emphasised the multidimensional utility of assessment, workplace-based methods, and - more recently - programmatic approaches that integrate many assessments over time rather than relying on single high-stakes examinations.

Debates

Can validity and reliability be maximised at the same time?: Authentic, performance-based assessments often gain validity at some cost to standardisation and reliability. Designers must therefore balance the qualities of an assessment rather than optimise any one of them, a tension central to the utility concept and to programmatic approaches.

Key figures

George Miller
Cees van der Vleuten
Ronald Epstein
Lambert Schuwirth

Seminal works

Miller (1990)
Epstein (2007)
Van der Vleuten et al. (2012)

Frequently asked questions

What is the difference between formative and summative assessment?: Formative assessment is intended to support and guide further learning through feedback, while summative assessment is used to certify achievement and make decisions such as passing or progression.
What does Miller's pyramid describe?: It describes four levels of clinical competence - knows, knows how, shows how, and does - and helps match the assessment method to the level of competence being evaluated.