Keyness Analysis

Keyness analysis identifies the words that are characteristically frequent (or infrequent) in a target corpus relative to a reference corpus, using statistical tests to measure how unexpected each word's frequency is. Introduced by Mike Scott in 1997, it answers the question 'what is this text or collection distinctively about?' and is a central technique in corpus linguistics and corpus-assisted discourse analysis for surfacing the salient vocabulary of a genre, period, author, or social group.
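As a rough illustration of the comparison described above, the sketch below scores every word with the log-likelihood statistic (G2), one statistic commonly used for keyness. The page itself does not name a specific test, so the choice of G2, the function names `log_likelihood` and `keyness`, the `min_freq` parameter, and the sign convention (positive for words relatively more frequent in the target corpus) are all assumptions made for this example.

```python
import math
from collections import Counter


def log_likelihood(a, b, n_target, n_reference):
    """G2 for a word seen `a` times in the target corpus (n_target tokens)
    and `b` times in the reference corpus (n_reference tokens)."""
    total = n_target + n_reference
    # Expected counts if the word were equally likely in both corpora.
    expected_target = n_target * (a + b) / total
    expected_reference = n_reference * (a + b) / total
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_target)
    if b > 0:
        g2 += b * math.log(b / expected_reference)
    return 2 * g2


def keyness(target_tokens, reference_tokens, min_freq=1):
    """Return (word, target_count, reference_count, signed_g2) tuples,
    sorted from most positively key to most negatively key.

    Hypothetical helper for illustration; the sign is an added convention:
    positive when the word is relatively more frequent in the target.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = sum(target.values())
    n_reference = sum(reference.values())
    if n_target == 0 or n_reference == 0:
        raise ValueError("both corpora must contain at least one token")

    rows = []
    for word in set(target) | set(reference):
        a, b = target[word], reference[word]
        if a + b < min_freq:
            continue
        g2 = log_likelihood(a, b, n_target, n_reference)
        sign = 1 if a / n_target >= b / n_reference else -1
        rows.append((word, a, b, sign * g2))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


if __name__ == "__main__":
    # Toy corpora, invented purely for demonstration.
    target = "tax tax tax budget budget the the".split()
    reference = "the the the the cat sat mat".split()
    for word, a, b, score in keyness(target, reference):
        print(f"{word:8s} target={a} reference={b} keyness={score:+.2f}")
```

Words at the top of the list are candidates for positive keywords and words at the bottom for negative ones. G2 is conventionally compared against the chi-square distribution with one degree of freedom, where 3.84 corresponds to p < 0.05; real corpora are far larger than this toy input, and any cut-off remains a choice for the analyst.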

Sources

  1. Scott, M. (1997). PC analysis of key words — and key key words. System, 25(2), 233–245. DOI: 10.1016/S0346-251X(97)00011-0
  2. Baker, P. (2006). Using Corpora in Discourse Analysis. Continuum. ISBN: 9780826477248
  3. Gabrielatos, C. (2018). Keyness analysis: Nature, metrics and techniques. In C. Taylor & A. Marchi (Eds.), Corpus Approaches to Discourse: A Critical Review (pp. 225–258). Routledge. ISBN: 9781138895157

How to cite this page

ScholarGate. (2026, June 22). Keyword and Keyness Analysis in Corpus Linguistics. ScholarGate. https://scholargate.app/en/linguistics/keyness-analysis

Referenced by

ScholarGate. Keyness Analysis (Keyword and Keyness Analysis in Corpus Linguistics). Retrieved 2026-06-24 from https://scholargate.app/en/linguistics/keyness-analysis · Dataset: https://doi.org/10.5281/zenodo.20539026