Compare methods

Review your selected methods side by side; rows that differ are highlighted.

	Keyword-in-Context (KWIC) Analysis	N-gram Analysis
Field	Linguistics	Linguistics
Family	Process / pipeline	Process / pipeline
Year of origin≠	1960	1999
Originator≠	H. P. Luhn (information retrieval); adopted in corpus linguistics by John Sinclair	Corpus linguists (Douglas Biber; lexical bundles tradition)
Type≠	Indexing and display technique aligning a keyword with its surrounding co-text	Frequency analysis of contiguous word sequences
Seminal source≠	Luhn, H. P. (1960). Key word-in-context index for technical literature (KWIC index). American Documentation, 11(4), 288–295.	Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Longman. ISBN: 9780582237254
Aliases	KWIC Index, Key Word in Context, Concordance Line Display	Lexical Bundle Analysis, Cluster Analysis (corpus linguistics), Contiguous Sequence Analysis
Related	4	4
Summary≠	Keyword-in-context (KWIC) analysis is an indexing and display technique that presents every occurrence of a chosen keyword aligned in a fixed central column, flanked by a set span of the words that precede and follow it. Invented by H. P. Luhn in 1960 to index technical literature, the KWIC format became the standard way to read a concordance: by stacking instances of the keyword so they line up vertically, it lets an analyst scan the surrounding co-text for recurrent neighbors and patterns. It is the display layer underlying broader corpus concordance work, valued because alignment turns a list of scattered occurrences into a visually legible pattern. Today KWIC views are the default output of most corpus-analysis tools and the entry point for studying collocation, colligation, and meaning in context.	N-gram analysis is a corpus-linguistic technique that extracts and ranks every contiguous sequence of n words (or characters) in a corpus, exposing the recurrent multi-word units (two-word bigrams, three-word trigrams, and longer 'lexical bundles') that characterize a register or text type. By counting how often each sequence recurs, it reveals the prefabricated, formulaic backbone of language that single-word frequency lists cannot capture.
ScholarGateDataset	v1 3 Sources PUBLISHED	v1 3 Sources PUBLISHED
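
The two summaries above describe mechanical procedures, so a small sketch may help make the contrast concrete. The Python below is illustrative only and is not taken from either seminal source: the function names (tokenize, kwic, ngram_counts), the regex tokenizer, the default span, and the sample sentence are all assumptions chosen for brevity. It aligns a keyword in a central column with a fixed span of co-text on each side, then counts contiguous word sequences.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately naive assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def kwic(tokens, keyword, span=3):
    """Return one aligned concordance line per occurrence of keyword."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            rows.append((left, tok, right))
    # Right-justify the left context so the keyword column lines up vertically.
    width = max((len(left) for left, _, _ in rows), default=0)
    return [f"{left:>{width}}  {kw}  {right}" for left, kw, right in rows]


def ngram_counts(tokens, n=2):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text, used only for illustration.
text = "The cat sat on the mat. The cat slept on the sofa."
tokens = tokenize(text)

# KWIC view: every hit for "cat" with two words of co-text on each side.
for line in kwic(tokens, "cat", span=2):
    print(line)

# N-gram view: the most frequent bigrams in the same text.
for gram, freq in ngram_counts(tokens, n=2).most_common(2):
    print(" ".join(gram), freq)
```

On this sample, the KWIC view stacks the two occurrences of "cat" so their neighbors can be read down the column, while the bigram count reports "the cat" and "on the" as the sequences that recur. A real study would swap in a proper tokenizer and a much larger corpus.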
