ScholarGate

Compare methods

Review your selected methods side by side; rows that differ are highlighted.

| | N-gram Analysis | Keyness Analysis |
|---|---|---|
| Field | Linguistics | Linguistics |
| Family | Process / pipeline | Process / pipeline |
| Year of origin | 1999 | 1997 |
| Originator | Corpus linguists (Douglas Biber; lexical bundles tradition) | Mike Scott |
| Type | Frequency analysis of contiguous word sequences | Corpus comparison of relative word frequencies |
| Seminal source | Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Longman. ISBN: 9780582237254 | Scott, M. (1997). PC analysis of key words - and key key words. System, 25(2), 233–245. |
| Aliases | Lexical Bundle Analysis, Cluster Analysis (corpus linguistics), Contiguous Sequence Analysis | Keyword Analysis, Corpus Keyness, Keyness Statistics |
| Related | 4 | 3 |
| ScholarGate Dataset | v1 · 3 Sources · PUBLISHED | v1 · 3 Sources · PUBLISHED |

Summary

N-gram Analysis: N-gram analysis is a corpus-linguistic technique that extracts and ranks every contiguous sequence of n words (or characters) in a corpus, exposing the recurrent multi-word units (two-word bigrams, three-word trigrams, and longer 'lexical bundles') that make up a register or text type. By counting how often each sequence recurs, it reveals the prefabricated, formulaic backbone of language that single-word frequency lists cannot capture.

Keyness Analysis: Keyness analysis identifies the words that are characteristically frequent (or infrequent) in a target corpus relative to a reference corpus, using statistical tests to measure how unexpected each word's frequency is. Introduced by Mike Scott in 1997, it answers the question 'what is this text or collection distinctively about?' and is a central technique in corpus linguistics and corpus-assisted discourse analysis for surfacing the salient vocabulary of a genre, period, author, or social group.
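
The counting step described in the N-gram Analysis summary can be sketched in a few lines of Python. This is a minimal illustration, not ScholarGate's or any particular tool's implementation: the tokenizer, the function names `tokenize` and `ngram_counts`, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Deliberately simple tokenizer (an assumption): lowercase the text and
    # keep runs of letters and apostrophes as word tokens.
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    # Slide a window of n tokens across the sequence and count every
    # contiguous tuple. If the text has fewer than n tokens, the result is empty.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text, chosen so that one bundle recurs.
text = "on the other hand it is clear that on the other hand it is not"
tokens = tokenize(text)
trigrams = ngram_counts(tokens, 3)

# Rank the sequences by frequency, most frequent first.
for gram, count in trigrams.most_common(4):
    print(" ".join(gram), count)
```

Real lexical-bundle studies usually add a minimum frequency threshold and a dispersion requirement (the bundle must occur in several different texts); both are omitted here to keep the sketch short.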

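The comparison described in the Keyness Analysis summary can be sketched as follows. The summary does not say which statistical test is used, so this example assumes the log-likelihood (G2) statistic, one commonly used keyness measure; the function names, the sign convention, and the two toy corpora are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    # G2 for a word seen a times in a target corpus of c tokens and
    # b times in a reference corpus of d tokens (assumes c > 0 and d > 0).
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_target)
    if b > 0:
        total += b * math.log(b / expected_reference)
    return 2 * total


def keyness(target_tokens, reference_tokens):
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scores = {}
    for word in set(target) | set(reference):
        a, b = target[word], reference[word]
        g2 = log_likelihood(a, b, c, d)
        # Sign convention (an assumption): positive when the word is relatively
        # more frequent in the target corpus, negative when it is less frequent.
        scores[word] = g2 if a / c >= b / d else -g2
    # Most distinctive target-corpus words first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy corpora; real studies use corpora of many thousands of tokens.
target_tokens = "the court ruled that the court may appeal".split()
reference_tokens = "the cat sat on the mat and the dog sat too".split()
ranked = keyness(target_tokens, reference_tokens)

for word, score in ranked[:3]:
    print(word, round(score, 2))
```

With corpora this small the scores carry no statistical weight. The sketch only shows the mechanics: words over-represented in the target corpus rise to the top of the ranking, and words typical of the reference corpus sink to the bottom with negative scores.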

ScholarGate. Compare methods: N-gram Analysis · Keyness Analysis. Retrieved 2026-06-24 from https://scholargate.app/en/compare