Studying deep neural networks, specifically LSTMs, I decided to follow the idea proposed in this link: Building Speech Dataset for LSTM binary classification, in order to build a classifier.
I have an audio dataset from which I extract MFCC features, where each array is 13x56, one for each phoneme of a word. The training data would look like this:
    X = [ [ [phon1fram[1][1],  phon1fram[1][2],  ..., phon1fram[1][56]],
            [phon1fram[2][1],  phon1fram[2][2],  ..., phon1fram[2][56]],
            ...
            [phon1fram[15][1], phon1fram[15][2], ..., phon1fram[15][56]] ],
          ...
          [ [phon5fram[1][1],  phon5fram[1][2],  ..., phon5fram[1][56]],
            ...
            [phon5fram[15][1], phon5fram[15][2], ..., phon5fram[15][56]] ] ]
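
In code, the input shape I am aiming for would be something like this (a minimal sketch with numpy; the array names and the sizes n_phonemes, n_frames and n_feats are placeholders taken from the example above, not my real values):

    import numpy as np

    # Placeholder sizes, taken from the example above
    n_phonemes = 5    # phonemes in the word
    n_frames   = 15   # frames per phoneme
    n_feats    = 56   # MFCC values per frame

    # Stand-in for the real per-phoneme arrays (each shaped n_frames x n_feats)
    phoneme_frames = [np.random.rand(n_frames, n_feats) for _ in range(n_phonemes)]

    # LSTM input is expected as (samples, timesteps, features)
    X = np.stack(phoneme_frames)
    print(X.shape)  # (5, 15, 56)
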
For the labelling, is it correct that the first frames should be labelled as "intermediate" and only the last frame actually represents the phoneme?
    Y = [ [ [0, 0, ..., 0],     # intermediate
            [0, 0, ..., 0],     # intermediate
            ...
            [1, 0, ..., 0] ],   # this one is a phoneme
          [ [0, 0, ..., 0],     # intermediate
            ...
            [0, 1, ..., 0] ] ]  # another phoneme
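
Building that labelling in code would look roughly like this (a minimal sketch; the number of classes and the class indices in phoneme_ids are assumptions, not my real values):

    import numpy as np

    # Sketch of the labelling above: every frame of a phoneme gets an all-zero
    # "intermediate" vector, and only the last frame gets a one-hot vector for
    # the phoneme itself. Class count and indices are placeholders.
    n_phonemes = 5
    n_frames   = 15
    n_classes  = 10                  # assumed size of the label vector

    phoneme_ids = [0, 1, 2, 3, 4]    # assumed class index of each phoneme

    Y = np.zeros((n_phonemes, n_frames, n_classes))
    for i, cls in enumerate(phoneme_ids):
        Y[i, -1, cls] = 1.0          # only the last frame is labelled
    print(Y.shape)  # (5, 15, 10)
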
Is this really correct? During my first tests, all the predicted outputs tended toward this "intermediate" label, since it is by far the most prevalent class. Is there any other approach I could use?
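
For reference, here is a minimal sketch of the kind of per-frame (many-to-many) setup described above, assuming Keras; the hidden size of 32 and the number of classes are placeholders, and the "intermediate" frames would have to be one of the output classes:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, TimeDistributed, Dense

    # Placeholder sizes matching the sketches above
    n_frames, n_feats, n_classes = 15, 56, 10

    # One softmax prediction per frame, so most targets are "intermediate"
    model = Sequential([
        LSTM(32, input_shape=(n_frames, n_feats), return_sequences=True),
        TimeDistributed(Dense(n_classes, activation='softmax')),
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.summary()
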