
MFCC (Mel-Frequency Cepstral Coefficients)

Mel-Frequency Cepstral Coefficients (MFCCs) are a compact representation of audio that approximates how human hearing resolves frequency. Introduced by Davis and Mermelstein in 1980, MFCCs became a de facto standard feature set for speech recognition and are also widely used in environmental sound analysis. They compress the short-term spectrum of an audio signal into a small set of coefficients that capture the overall spectral shape, and with it much of the phonetic content, while discarding fine detail such as individual pitch harmonics.
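To make the compression step concrete, here is a minimal sketch of computing MFCCs for a single frame using only the Python standard library. It is an illustration under stated assumptions, not the formulation from the cited sources: the mel mapping 2595 * log10(1 + f / 700) is one common convention, the Hamming window, the 20 filters and 13 coefficients are typical but arbitrary choices, and all function names are hypothetical. Pre-emphasis, framing of longer signals, and liftering are left out, and the naive DFT is only practical for short frames.

```python
import cmath
import math


def hz_to_mel(f):
    # One common mel-scale convention (assumed here, not taken from the source).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (n_filters + 1)
                  for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # peak and falling edge
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping only the first n_coeffs outputs."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log of each filter's energy; the floor avoids log(0) for empty filters.
    log_energies = [math.log(max(sum(w * p for w, p in zip(f, spec)), 1e-10))
                    for f in bank]
    return dct2(log_energies, n_coeffs)


# Usage: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In this sketch, the 129 spectral bins of the frame are reduced first to 20 mel-band energies and then to 13 cepstral coefficients, which is the sense in which MFCCs compress frequency information. In practice a library implementation with a fast FFT would normally be used instead.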


Sources

  1. Davis, S., & Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), 357-366. DOI: 10.1109/TASSP.1980.1163420
  2. Young, S. J., Evermann, G., Gales, M. J., et al. (1996). The HTK Book. Cambridge University Engineering Department.
  3. Moustakides, G. V., & Rougui, J. A. (2004). Optimal filtering for polynomial signal models. IEEE Transactions on Signal Processing, 52(8), 2219-2230. DOI: 10.1109/TSP.2004.831058


ScholarGate. MFCC (Mel-Frequency Cepstral Coefficients). Retrieved 2026-06-04 from https://scholargate.app/en/applied-physics/mfcc