Usability Testing

Usability testing evaluates an interface by observing representative users as they attempt realistic tasks, identifying where they struggle, succeed, or err.

Definition

Usability testing is an evaluation method in which representative users perform representative tasks with a system while observers record their behaviour, errors, and comments, in order to discover usability problems and measure performance.

Scope

This topic covers the empirical evaluation of interfaces with real users: planning test tasks, recruiting representative participants, running think-aloud and observation sessions, and collecting both performance data and verbal reports. It addresses formative testing to find and fix problems and summative testing to benchmark performance, along with sample-size considerations. It does not cover expert inspection methods such as heuristic evaluation, which are treated separately, or the statistical treatment of metrics, which is covered under usability metrics and measurement.

Core questions

  • How are realistic test tasks and representative participants chosen?
  • What is the think-aloud protocol and what kind of data does it provide?
  • How do formative and summative usability tests differ in purpose?
  • How many participants are needed to find most usability problems?

Key concepts

  • representative tasks
  • representative users
  • think-aloud protocol
  • formative vs summative testing
  • task success and completion
  • facilitation and moderator effects
  • sample size
  • observation and logging

Key theories

Think-aloud protocol
Asking users to verbalize their thoughts while performing tasks externalizes their reasoning and points of confusion; Ericsson and Simon's analysis of verbal reports established when such reports validly reflect the contents of working memory.
Small-sample formative testing
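Empirical studies, notably by Virzi and by Nielsen, suggest that around five participants uncover a large proportion of an interface's usability problems (figures of roughly 80% are often cited), with each additional participant revealing fewer new problems. This supports iterative testing with several small rounds rather than one large study.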
Test planning and facilitation
Effective usability tests rest on well-chosen tasks, a neutral facilitator who avoids leading participants, and careful recording, so that observed difficulties reflect the interface rather than the test setup.
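
The small-sample argument above is often illustrated with a simple cumulative discovery model. The sketch below is a minimal illustration under stated assumptions, not a method prescribed by this entry: it assumes every participant independently encounters any given problem with the same probability p, so that the expected share of problems seen at least once by n participants is 1 - (1 - p)^n. The function name proportion_found is hypothetical, and the value p = 0.31 is used only as an illustrative average of the kind reported in the sample-size literature.

```python
def proportion_found(p: float, n: int) -> float:
    """Expected share of problems observed at least once by n participants.

    Assumes (for illustration) that each participant independently
    encounters any given problem with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Print the discovery curve for an assumed per-participant probability.
ASSUMED_P = 0.31  # illustrative value, not taken from this entry
for participants in range(1, 11):
    share = proportion_found(ASSUMED_P, participants)
    print(f"{participants:2d} participants: {share:.1%} of problems expected")
```

Under these assumptions, five participants are expected to reveal about 84% of problems, and each further participant adds less than the one before, which is the reasoning behind running several small rounds.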

Clinical relevance

Usability testing is the most direct way to see how real people use a product and is widely applied in software, web, and device development. In regulated areas such as medical devices, summative usability testing provides evidence that intended users can operate a system without dangerous errors.

History

Drawing on human-factors testing traditions, usability testing became central to software development in the 1980s and 1990s. Ericsson and Simon's 1980 work grounded the use of verbal reports, and practical guides by Dumas and Redish and others standardized how tests are planned and run. Debate over optimal sample size, sparked by Virzi and Nielsen, shaped modern formative practice.

Debates

How many users are enough to find usability problems?
Influential studies argued that a small number of users reveals most problems. Critics note that this depends on problem frequency (how likely each participant is to encounter a given problem) and on how much of the system the test tasks cover, so larger or repeated tests may be needed for complex systems or summative claims.
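
The dependence on problem frequency can be sketched numerically. The example below is an illustration under assumptions, not a rule from this entry: it assumes each participant independently encounters a given problem with probability p, and it asks how many participants are needed before the chance of seeing that problem at least once reaches a target. The function name participants_needed and the probabilities used are hypothetical.

```python
def participants_needed(p: float, target: float) -> int:
    """Smallest n such that 1 - (1 - p)**n >= target.

    Assumes (for illustration) independent participants who each
    encounter the problem with the same probability p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    n = 0
    missed = 1.0  # probability that no participant so far has hit the problem
    while 1.0 - missed < target:
        n += 1
        missed *= 1.0 - p
    return n


# Compare a frequent problem with progressively rarer ones (assumed values).
for assumed_p in (0.31, 0.10, 0.05):
    n = participants_needed(assumed_p, target=0.80)
    print(f"p = {assumed_p:.2f}: {n} participants for an 80% chance of observing it")
```

With these assumed values, a problem that affects about a third of users needs only a handful of participants, while one that affects one user in twenty needs several times as many, which is why critics caution against applying a single small number to complex systems or summative claims.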

Key figures

  • Jakob Nielsen
  • Joseph Dumas
  • Janice Redish
  • K. Anders Ericsson
  • Herbert A. Simon

Seminal works

  • nielsen1993
  • ericsson1980
  • virzi1992

Frequently asked questions

What is the think-aloud method?
In the think-aloud method, participants are asked to say out loud what they are thinking as they work through tasks. This reveals their expectations, confusion, and reasoning in real time, helping evaluators understand not just where users fail but why.
Does usability testing need a special lab?
No. While dedicated labs with recording equipment are useful, valuable usability testing can be done in an office, a participant's own setting, or remotely over the internet. What matters most is realistic tasks, representative users, and careful observation, not expensive facilities.
