
MFCC (Mel-Frequency Cepstral Coefficients)

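Mel-frequency cepstral coefficients (MFCCs) are a compact representation of audio features that mimics human auditory perception. Introduced by Davis and Mermelstein in 1980, MFCCs are the de facto standard feature extraction method for speech recognition and environmental sound analysis. They compress an audio signal's frequency information into a small set of coefficients that capture phonetic content while discarding irrelevant detail.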
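To make the compression step concrete, here is a minimal sketch of how one audio frame might be reduced to a handful of coefficients. It is not taken from the source: the Hamming window, the 2595/700 mel formula, the 20 triangular filters, and the 13 output coefficients are common conventions assumed here for illustration, and the function names are hypothetical. It uses only the Python standard library so that each stage stays visible, at the cost of a slow naive DFT.

```python
import cmath
import math

def hz_to_mel(hz):
    # A widely used mel-scale approximation (assumed, not from the source text)
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hamming-windowed frame, via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_out):
    """Unnormalized DCT-II, keeping only the first n_out coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_out)]

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small floor avoids log(0) when a filter captures no energy
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return dct2(log_energies, n_coeffs)

# Usage: a 256-sample frame of a 440 Hz tone sampled at 8 kHz
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

In this sketch a 256-sample frame (129 spectral bins) ends up as 13 numbers. The mel filterbank supplies the perceptual weighting, and truncating the DCT is what discards fine spectral detail. Practical pipelines usually add pre-emphasis, overlapping frames, and an FFT, typically through an established library.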


Davis, S., & Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), 357-366. DOI: 10.1109/TASSP.1980.1163420
Young, S. J., Evermann, G., Gales, M. J., et al. (1996). The HTK Book. Cambridge University Engineering Department.
Moustakides, G. V., & Rougui, J. A. (2004). Optimal filtering for polynomial signal models. IEEE Transactions on Signal Processing, 52(8), 2219-2230.

ScholarGate. (2026, June 3). Mel-Frequency Cepstral Coefficients. ScholarGate. https://scholargate.app/zh/applied-physics/mfcc
