Method Comparison
View your selected methods side by side; rows that differ are marked with ≠.
| | vocd-D (D Measure) | N-gram Analysis |
|---|---|---|
| Field | Linguistics | Linguistics |
| Method family | Process / pipeline | Process / pipeline |
| Year of origin ≠ | 2004 | 1999 |
| Proposed by ≠ | David Malvern & Brian Richards | Corpus linguists (Douglas Biber; lexical bundles tradition) |
| Type ≠ | Length-robust index of lexical diversity | Frequency analysis of contiguous word sequences |
| Seminal work ≠ | Malvern, D., Richards, B., Chipere, N., & Durán, P. (2004). Lexical Diversity and Language Development: Quantification and Assessment. Palgrave Macmillan. ISBN: 9781403902313 | Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman Grammar of Spoken and Written English. Longman. ISBN: 9780582237254 |
| Aliases ≠ | vocd-D, D Measure, vocd, HD-D | Lexical Bundle Analysis, Cluster Analysis (corpus linguistics), Contiguous Sequence Analysis |
| Related | 4 | 4 |
| Summary ≠ | vocd-D, also called the D measure, is a length-robust index of lexical diversity developed by David Malvern and Brian Richards. Instead of reporting a single type-token ratio (TTR), it characterizes how a text's TTR falls as sample size grows and fits that empirical curve to a one-parameter probabilistic model; the fitted parameter D is the diversity score, with higher D meaning richer vocabulary. HD-D, introduced by McCarthy and Jarvis, is the mathematically exact, sampling-free counterpart that computes the same underlying quantity directly from the hypergeometric distribution. | N-gram analysis is a corpus-linguistic technique that extracts and ranks every contiguous sequence of n words (or characters) in a corpus, exposing the recurrent multi-word units (two-word bigrams, three-word trigrams, and longer 'lexical bundles') that make up a register or text type. By counting how often each sequence recurs, it reveals the prefabricated, formulaic backbone of language that single-word frequency lists cannot capture. |
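The two summaries above can be illustrated with a minimal Python sketch. It is not either method's reference implementation. The function names `ngram_counts` and `hdd` are hypothetical, tokens are assumed to be pre-split words, and the default sample size of 42 is assumed from the convention commonly used with HD-D. The sketch covers only the sampling-free HD-D calculation, not the random-sampling curve fit that vocd-D itself performs.

```python
from collections import Counter
from math import comb


def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens (illustrative helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def hdd(tokens, sample_size=42):
    """Expected type-token ratio of a random sample of `sample_size` tokens,
    drawn without replacement, computed from the hypergeometric distribution.

    Assumption: sample_size=42 follows the usual HD-D convention.
    """
    total = len(tokens)
    if total < sample_size:
        raise ValueError("text is shorter than the sample size")
    all_samples = comb(total, sample_size)
    expected_types = 0.0
    for freq in Counter(tokens).values():
        # Probability that a type with this frequency is absent from the sample.
        p_absent = comb(total - freq, sample_size) / all_samples
        expected_types += 1.0 - p_absent
    return expected_types / sample_size


if __name__ == "__main__":
    words = "the cat sat on the mat and the cat slept".split()
    # Rank bigrams by frequency, most frequent first.
    print(ngram_counts(words, 2).most_common(3))
    # The toy text has only 10 tokens, so use a sample size that fits.
    print(hdd(words, sample_size=5))
```

A higher `hdd` value indicates a more diverse vocabulary, while the top entries of `ngram_counts(...).most_common()` are the candidates for recurrent bundles. A real study would also fix a tokenization scheme and a minimum frequency threshold.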