Questions tagged [mfcc]

Mel-Frequency Cepstral Coefficients (MFCCs): an alternative representation of a speech signal based on its frequency content. A very popular way to represent a speech signal as a feature vector, used primarily for speech recognition tasks.

Mel Frequency Cepstral Coefficients (MFCCs) are the coefficients obtained when a speech signal is analysed by a bank of filters whose center frequencies are evenly spaced on the mel scale, which is roughly linear at low frequencies and logarithmic at higher ones. This spacing is significant because it approximates the frequency resolution of the human ear. MFCCs are computed from the magnitude mel-spectrogram by taking the logarithm and then applying the Discrete Cosine Transform to obtain the cepstrum. MFCCs are very popular features for speech recognition tasks.
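The pipeline described above (magnitude spectrum, mel filter bank, logarithm, DCT) can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used by any particular library: the function names (`mel_filterbank`, `dct_matrix`, `mfcc_sketch`), the HTK-style mel formula, the Hann window, the natural logarithm, and the defaults (`n_fft=512`, `hop=160`, 26 filters, 13 coefficients) are all assumptions chosen for the example. Libraries differ on exactly these choices, which is one reason their outputs rarely match.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are evenly spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_matrix(n_in, n_out):
    # Orthonormal DCT-II basis, shape (n_in, n_out)
    n = np.arange(n_in)[:, None]
    k = np.arange(n_out)[None, :]
    basis = np.sqrt(2.0 / n_in) * np.cos(np.pi * k * (n + 0.5) / n_in)
    basis[:, 0] /= np.sqrt(2.0)
    return basis

def mfcc_sketch(signal, sr, n_fft=512, hop=160, n_mels=26, n_mfcc=13):
    # Assumes len(signal) >= n_fft; trailing samples that do not fill a frame are dropped
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    magnitude = np.abs(np.fft.rfft(frames, axis=1))        # magnitude spectrum per frame
    mel_spec = magnitude @ mel_filterbank(sr, n_fft, n_mels).T  # magnitude mel-spectrogram
    log_mel = np.log(mel_spec + 1e-10)                     # log-scaling (small offset avoids log(0))
    return log_mel @ dct_matrix(n_mels, n_mfcc)            # DCT gives the cepstral coefficients

# Usage: one second of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
features = mfcc_sketch(np.sin(2 * np.pi * 440 * t), sr)
print(features.shape)  # (frames, coefficients)
```

The result has one row per frame and one column per coefficient. Note that some libraries return the transpose of this layout.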

312 questions
22
votes
3 answers

Difference between mel-spectrogram and an MFCC

I'm using the librosa library to convert music segments into mel-spectrograms to use as inputs for my neural network, as shown in the docs here. How is this different from MFCCs, if at all? Are there any advantages or disadvantages to using either?
monadoboi
  • 1,651
  • 3
  • 16
  • 26
13
votes
5 answers

How to plot MFCC in Python?

Here is my code so far for extracting MFCC features from an audio file (.wav): from python_speech_features import mfcc import scipy.io.wavfile as wav (rate,sig) = wav.read("AudioFile.wav") mfcc_feat = mfcc(sig,rate) print(mfcc_feat) How can I plot…
E. Alicaya
  • 141
  • 1
  • 2
  • 5
11
votes
1 answer

Python Librosa : What is the default frame size used to compute the MFCC features?

Using the librosa library, I generated the MFCC features of a 1319-second audio file as a 20 x 56829 matrix. The 20 here represents the number of MFCC features (which I can adjust manually). But I don't know how it segmented the audio length into 56829.…
Rangooski
  • 825
  • 1
  • 11
  • 29
11
votes
1 answer

How to plot a multi-dimensional data point in python

Some background first: I want to plot the Mel-Frequency Cepstral Coefficients of various songs and compare them. I calculate MFCCs throughout a song and then average them to get one array of 13 coefficients. I want this to represent one point on a…
CatLord
  • 361
  • 1
  • 4
  • 14
10
votes
3 answers

MFCC Python: completely different result from librosa vs python_speech_features vs tensorflow.signal

I'm trying to extract MFCC features from audio (.wav file) and I have tried python_speech_features and librosa but they are giving completely different results: audio, sr = librosa.load(file, sr=None) # librosa hop_length = int(sr/100) n_fft =…
TYZ
  • 8,466
  • 5
  • 29
  • 60
10
votes
1 answer

Understanding the output of mfcc

from librosa.feature import mfcc from librosa.core import load def extract_mfcc(sound): data, frame = load(sound) return mfcc(data, frame) mfcc = extract_mfcc("sound.wav") I would like to get the MFCC of the following sound.wav file…
9
votes
2 answers

How to generate MFCC Algorithm's triangular windows and how to use them?

I am implementing the MFCC algorithm in Java. There is sample code in MATLAB here: http://www.ee.columbia.edu/~dpwe/muscontent/practical/mfcc.m. However, I have some problems with the mel filter bank step. How to generate triangular windows and how…
kamaci
  • 72,915
  • 69
  • 228
  • 366
9
votes
1 answer

How do Mel Frequency Cepstrum Coefficients work?

I already have the FFT and pitch + absolute frequency calculated in real time from microphone input. Now I want to calculate the timbre. I saw Mel Frequency Cepstrum Coefficients (MFCCs), but I didn't understand them very well. Can someone give me some…
André
  • 146
  • 3
  • 5
8
votes
1 answer

Methods for determining acoustical similarity (but not fingerprinting)

I'm looking for methods that work in practise for determining some kind of acoustical similarity between different songs. Most of the methods I've seen so far (MFCC etc.) seem actually to aim at finding identical songs only (i.e. fingerprinting, for…
Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194
8
votes
2 answers

Mel Frequency Cepstral Coefficients (MFCC) in C/C++

Is there any implementation of MFCC available in C/C++? Any source code or libraries? I've already found http://code.google.com/p/libmfcc/ which seems to be good.
Ali
  • 935
  • 2
  • 9
  • 10
7
votes
1 answer

Why do Mel-filterbank energies outperform MFCCs for speech commands recognition using CNN?

Last month, a user called @jojek gave me the following advice in a comment: I can bet that given enough data, CNN on Mel energies will outperform MFCCs. You should try it. It makes more sense to do convolution on Mel spectrogram rather than on…
7
votes
1 answer

How to train a machine learning algorithm using MFCC coefficient vectors?

For my final-year project I am trying to identify dog/bark/bird sounds in real time (by recording sound clips). I am using MFCCs as the audio features. Initially I extracted 12 MFCC vectors in total from a sound clip using the jAudio library. Now I'm…
7
votes
1 answer

Building Speech Dataset for LSTM binary classification

I'm trying to do binary LSTM classification using theano. I have gone through the example code however I want to build my own. I have a small set of "Hello" & "Goodbye" recordings that I am using. I preprocess these by extracting the MFCC features…
Nirbhay Tandon
  • 318
  • 2
  • 13
7
votes
1 answer

MATLAB Murphy's HMM Toolbox

I am trying to learn HMM/GMM implementation and created a simple model to detect certain sounds (animal calls etc.). I am trying to train an HMM (Hidden Markov Model) network with GMMs (Gaussian mixtures) in MATLAB. I have a few questions, I could…
6
votes
2 answers

Why do MFCC extraction libs return different values?

I am extracting MFCC features using two different libraries: the python_speech_features lib and the BOB lib. However, the outputs of the two are different, and even the shapes are not the same. Is that normal? Or is there a parameter that I am…
SuperKogito
  • 2,998
  • 3
  • 16
  • 37