
Compare methods

Review your selected methods side by side; rows that differ are highlighted.

| | Keyword-in-Context (KWIC) Analysis | N-gram Analysis |
|---|---|---|
| Field | Linguistics | Linguistics |
| Family | Process / pipeline | Process / pipeline |
| Year of origin | 1960 | 1999 |
| Originator | H. P. Luhn (information retrieval); adopted in corpus linguistics by John Sinclair | Corpus linguists (Douglas Biber; lexical bundles tradition) |
| Type | Indexing and display technique aligning a keyword with its surrounding co-text | Frequency analysis of contiguous word sequences |
| Seminal source | Luhn, H. P. (1960). Key word-in-context index for technical literature (KWIC index). American Documentation, 11(4), 288–295. | Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Longman. ISBN: 9780582237254 |
| Aliases | KWIC Index, Key Word in Context, Concordance Line Display | Lexical Bundle Analysis, Cluster Analysis (corpus linguistics), Contiguous Sequence Analysis |
| Related | 4 | 4 |
| ScholarGate dataset | v1 · 3 Sources · PUBLISHED | v1 · 3 Sources · PUBLISHED |

Summary (Keyword-in-Context (KWIC) Analysis): Keyword-in-context (KWIC) analysis is the indexing and display technique that presents every occurrence of a chosen keyword aligned in a fixed central column, flanked by a set span of the words that precede and follow it. Invented by H. P. Luhn in 1960 to index technical literature, the KWIC format became the standard way to read a concordance: by stacking instances of the keyword so they line up vertically, it lets an analyst scan the surrounding co-text for recurrent neighbors and patterns. It is the specific display layer underlying broader corpus concordance work, valued because alignment turns a list of scattered occurrences into a visually legible pattern. Today KWIC views are the default output of most corpus-analysis tools and the entry point for studying collocation, colligation, and meaning in context.

Summary (N-gram Analysis): N-gram analysis is a corpus-linguistic technique that extracts and ranks every contiguous sequence of n words (or characters) in a corpus, exposing the recurrent multi-word units (two-word bigrams, three-word trigrams, and longer 'lexical bundles') that make up a register or text type. By counting how often each sequence recurs, it reveals the prefabricated, formulaic backbone of language that single-word frequency lists cannot capture.
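
To make the contrast between the two methods concrete, the following is a minimal Python sketch rather than the implementation of any particular corpus tool. The function names (`tokenize`, `kwic`, `ngram_counts`), the lowercase regex tokenizer, the `span` parameter, and the sample sentence are all assumptions introduced here for illustration. `kwic` pads the left context so the keyword lines up in one column, while `ngram_counts` tallies every contiguous sequence of n tokens.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def kwic(tokens, keyword, span=3):
    """Return one concordance line per occurrence of keyword, keyword column aligned."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            rows.append((left, tok, right))
    # Right-justify the left context so every keyword starts in the same column.
    width = max((len(left) for left, _, _ in rows), default=0)
    return [f"{left:>{width}}  {kw}  {right}" for left, kw, right in rows]


def ngram_counts(tokens, n=2):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text, used only to exercise the two functions.
text = "the cat sat on the mat and the cat slept on the mat"
tokens = tokenize(text)

kwic_lines = kwic(tokens, "cat", span=2)
bigrams = ngram_counts(tokens, n=2)

for line in kwic_lines:
    print(line)
print(bigrams.most_common(3))
```

The KWIC output keeps each occurrence in its local context for reading, whereas the n-gram counter discards position and keeps only frequencies, which is the practical difference the Type row describes.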


ScholarGate. Compare methods: Keyword-in-Context (KWIC) Analysis · N-gram Analysis. Retrieved 2026-06-25 from https://scholargate.app/en/compare