Keyness Analysis

Keyness analysis identifies the words that are unusually frequent (or unusually infrequent) in a target corpus relative to a reference corpus, using a statistical test, commonly log-likelihood or chi-squared, to measure how unexpected each word's frequency is. Introduced by Mike Scott in 1997, it answers the question 'What is this text or collection distinctively about?' and is a central technique in corpus linguistics and corpus-assisted discourse analysis for surfacing the salient vocabulary of a genre, period, author, or social group.
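
As a rough illustration of the comparison described above, the sketch below scores each word with the two-cell log-likelihood statistic, one commonly used keyness measure. The choice of statistic, the function names `log_likelihood` and `keyness`, the sign convention, and the toy token lists are all assumptions made for illustration; this entry does not prescribe a particular implementation.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Two-cell log-likelihood for one word.

    a, b: the word's frequency in the target and reference corpus.
    c, d: total token counts of the target and reference corpus.
    """
    if c <= 0 or d <= 0:
        raise ValueError("both corpora must contain at least one token")
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero observed count contributes nothing (0 * ln 0 is treated as 0).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keyness(target_tokens, reference_tokens):
    """Return (word, score) pairs sorted from most to least key.

    Assumed convention: the score is negated when the word is relatively
    rarer in the target corpus, so negative scores mark 'negative keywords'.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    c, d = sum(target.values()), sum(reference.values())
    rows = []
    for word in set(target) | set(reference):
        a, b = target[word], reference[word]
        score = log_likelihood(a, b, c, d)
        if a / c < b / d:
            score = -score
        rows.append((word, score))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical toy corpora, already tokenised.
target_tokens = "tax tax tax the the".split()
reference_tokens = "the the the the cat".split()
ranking = keyness(target_tokens, reference_tokens)
```

With these toy lists, `tax` ranks first (it occurs only in the target corpus) and `cat` ranks last (it occurs only in the reference corpus). In practice the score is usually read alongside a significance threshold and an effect-size measure, since log-likelihood grows with corpus size.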

Sources

Scott, M. (1997). PC analysis of key words — and key key words. System, 25(2), 233–245. DOI: 10.1016/S0346-251X(97)00011-0
Baker, P. (2006). Using Corpora in Discourse Analysis. Continuum. ISBN: 9780826477248
Gabrielatos, C. (2018). Keyness analysis: Nature, metrics and techniques. In C. Taylor & A. Marchi (Eds.), Corpus Approaches to Discourse: A Critical Review (pp. 225–258). Routledge. ISBN: 9781138895157

How to cite this page

ScholarGate. (2026, June 22). Keyword and Keyness Analysis in Corpus Linguistics. ScholarGate. https://scholargate.app/en/linguistics/keyness-analysis

Referenced by

Collostructional Analysis
Corpus Concordance Analysis
Corpus-Assisted Discourse Studies
Multidimensional Register Analysis
N-gram Analysis

Related reference concepts

Corpus Linguistics and Web Corpora
Stylometry and Authorship Attribution
Computational Text Analysis
Evaluation and Annotation
Topic Modeling and Text Mining
Lexical and Corpus Resources
