Compare methods
View the selected methods side by side; rows that differ are marked with ≠.
| | Keyword-in-Context (KWIC) Analysis | N-gram Analysis |
|---|---|---|
| Field | Linguistics | Linguistics |
| Family | Process / pipeline | Process / pipeline |
| Year of origin ≠ | 1960 | 1999 |
| Creator ≠ | H. P. Luhn (information retrieval); adopted in corpus linguistics by John Sinclair | Corpus linguists (Douglas Biber; lexical bundles tradition) |
| Type ≠ | Indexing and display technique aligning a keyword with its surrounding co-text | Frequency analysis of contiguous word sequences |
| Original source ≠ | Luhn, H. P. (1960). Key word-in-context index for technical literature (KWIC index). American Documentation, 11(4), 288–295. | Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Longman. ISBN: 9780582237254 |
| Other names | KWIC Index, Key Word in Context, Concordance Line Display | Lexical Bundle Analysis, Cluster Analysis (corpus linguistics), Contiguous Sequence Analysis |
| Related | 4 | 4 |
| Summary ≠ | Keyword-in-context (KWIC) analysis is an indexing and display technique that presents every occurrence of a chosen keyword aligned in a fixed central column, flanked by a set span of the words that precede and follow it. Invented by H. P. Luhn in 1960 to index technical literature, the KWIC format became the standard way to read a concordance: by stacking instances of the keyword so they line up vertically, it lets an analyst scan the surrounding co-text for recurrent neighbors and patterns. It is the display layer underlying broader corpus concordance work, valued because alignment turns a list of scattered occurrences into a visually legible pattern. Today KWIC views are the default output of most corpus-analysis tools and the entry point for studying collocation, colligation, and meaning in context. | N-gram analysis is a corpus-linguistic technique that extracts and ranks every contiguous sequence of n words (or characters) in a corpus, exposing the recurrent multi-word units (two-word bigrams, three-word trigrams, and longer 'lexical bundles') that characterize a register or text type. By counting how often each sequence recurs, it reveals the prefabricated, formulaic backbone of language that single-word frequency lists cannot capture. |
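
To make the two summaries concrete, here is a minimal Python sketch of both techniques. It is an illustration under stated assumptions, not the implementation used by any particular corpus tool: the function names (`tokenize`, `kwic`, `ngram_counts`), the naive regex tokenizer, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a simplifying assumption;
    real corpus tools use more careful tokenization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def kwic(tokens, keyword, span=3):
    """Return one concordance line per occurrence of `keyword`,
    with up to `span` tokens of co-text on each side."""
    hits = [
        (" ".join(tokens[max(0, i - span):i]), tok, " ".join(tokens[i + 1:i + 1 + span]))
        for i, tok in enumerate(tokens)
        if tok == keyword
    ]
    # Right-justify the left context so every keyword starts in the same column.
    width = max((len(left) for left, _, _ in hits), default=0)
    return [f"{left.rjust(width)}  {kw}  {right}" for left, kw, right in hits]


def ngram_counts(tokens, n=2):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Invented sample text, for illustration only.
tokens = tokenize("The cat sat on the mat and the dog sat on the rug")

for line in kwic(tokens, "sat", span=2):
    print(line)

print(ngram_counts(tokens, n=2).most_common(2))
```

In this toy sample both occurrences of "sat" line up in one column, and the bigram counts show "sat on" and "on the" recurring twice each, which is the kind of repeated sequence a lexical-bundle study would rank.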