I came across this nice tutorial, https://github.com/manashmndl/DeadSimpleSpeechRecognizer, where the model is trained on samples separated into folders and all the MFCCs are calculated at once.

I am trying to achieve something similar but in a different way.

According to https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html, librosa can compute the MFCC for any audio as follows:

import librosa
y, sr = librosa.load('test.wav')
mymfcc = librosa.feature.mfcc(y=y, sr=sr)

However, I want to calculate the MFCC of the audio part by part, based on timestamps from a file.

The file has labels and timestamps as follows:

0.0 2.0 sound1
2.0 4.0 sound2
4.0 7.0 silence
7.0 11.0 sound1

I want to calculate the MFCC of each range. My hope is to arrive at labelled training data that pairs each MFCC with its corresponding label: (mfcc_1, sound1), (mfcc_2, sound2), and so on.

How do I achieve this?

I looked at "generate mfcc's for audio segments based on annotated file". That question is similar, but I found both the question and the answer somewhat hard to follow (because I'm very new to this field).

TIA

UPDATE: My code:

import librosa
from subprocess import call

def ListDir():
    call(["ls", "-l"])

def main():
    ListDir()
    readfile_return_segmentsmfcc()

my_segments = []

# Read the annotation file and compute the MFCC of each labelled segment
def readfile_return_segmentsmfcc():

    pat = '000.mp3'
    y, sr = librosa.load(pat)

    print("\n sample rate :")
    print(sr)

    # Text mode, so that each line is a str and can be split on '\t'
    with open("000.txt", "r") as f:
        for line in f:
            start_time, end_time, label = line.split('\t')
            start_time = float(start_time)
            end_time = float(end_time)
            label = label.strip()
            my_segments.append((start_time, end_time, label))

            # Pass sr explicitly so the conversion uses the file's sample rate
            start_index = librosa.time_to_samples(start_time, sr=sr)
            end_index = librosa.time_to_samples(end_time, sr=sr)

            required_slice = y[start_index:end_index]
            required_mfcc = librosa.feature.mfcc(y=required_slice, sr=sr)
            print("Mfcc size is {} ".format(required_mfcc.shape))

            print(start_time, end_time, label)

    return my_segments


main()
DJ_Stuffy_K
  • 1
    In my case, I just used ` required_slice = y[start_index[0]:end_index[0] ` didn't have to use the int() conversion. – kRazzy R Feb 04 '18 at 20:36

1 Answer

  • Read the start and end times:
    start = 2.0
    end = 4.0

  • Convert them to sample indices using librosa.time_to_samples, passing the sample rate returned by librosa.load:
    start_index = librosa.time_to_samples(start, sr=sr)
    end_index = librosa.time_to_samples(end, sr=sr)

  • Use Python's [:] slicing operator to get the relevant slice of the data:
    segment = y[int(start_index):int(end_index)]

  • Compute the MFCC on that slice, etc.
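
Putting the steps above together, here is a minimal sketch of how the (MFCC, label) pairs could be built. The helper names `parse_annotations`, `librosa_mfcc`, and `features_per_segment`, and the `feature_fn` parameter, are my own and not part of librosa. The index arithmetic (seconds times sample rate, truncated to an integer) is intended to match what `librosa.time_to_samples` does when given `sr`.

```python
import numpy as np


def parse_annotations(lines):
    """Parse 'start end label' lines (tab- or space-separated) into tuples."""
    segments = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end = float(parts[0]), float(parts[1])
        label = " ".join(parts[2:])
        segments.append((start, end, label))
    return segments


def librosa_mfcc(segment, sr):
    """Default feature extractor: librosa's MFCC (imported only when called)."""
    import librosa
    return librosa.feature.mfcc(y=segment, sr=sr)


def features_per_segment(y, sr, segments, feature_fn=librosa_mfcc):
    """Return one (features, label) pair per annotated segment."""
    pairs = []
    for start, end, label in segments:
        # Convert seconds to integer sample indices before slicing.
        start_index = int(start * sr)
        end_index = int(end * sr)
        pairs.append((feature_fn(y[start_index:end_index], sr), label))
    return pairs
```

With `y, sr = librosa.load('000.mp3')` and the annotation file opened in text mode, `features_per_segment(y, sr, parse_annotations(f))` would give a list like `[(mfcc_1, 'sound1'), (mfcc_2, 'sound2'), ...]`. Each MFCC array has a different number of columns, because the segments have different lengths.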

Eran W
  • 1
    Hope it helped - Please, consider accepting my answer. ;) – Eran W Jan 31 '18 at 06:48
  • Thank you for your answer. Please suggest how to store the MFCCs generated in the last step, because mfcc.shape is (20, X), where X depends on the length of a particular segment. So in the above example, 4 different MFCCs will be generated. – DJ_Stuffy_K Feb 01 '18 at 02:49
  • I get the following error : `Traceback (most recent call last): File "generate_mfcc_based_on_segments.py", line 46, in main() File "generate_mfcc_based_on_segments.py", line 10, in main read_annotate_file_return_segments_and_mfcc() File "generate_mfcc_based_on_segments.py", line 33, in read_annotate_file_return_segments_and_mfcc required_slice = y[start_index:end_index] TypeError: only integer scalar arrays can be converted to a scalar index ` Updated question with my code – DJ_Stuffy_K Feb 01 '18 at 04:07
  • 1
    The error is because index should be an integer. I added conversion to int – Eran W Feb 01 '18 at 06:15
  • your thoughts on whether a namedtuple should be used ? – DJ_Stuffy_K Feb 05 '18 at 19:12
  • Could you please tell me how to save the MFCCs that are generated in the last step: ` required_mfcc = librosa.feature.mfcc(y=required_slice, sr=sr) ` – DJ_Stuffy_K Feb 05 '18 at 21:22
  • 1
    That is broader in scope than this question, and it depends heavily on the application and on how you are going to use the saved data. If you post a new question describing what you are trying to achieve and what you have tried, you increase the likelihood of getting help :) – Eran W Feb 06 '18 at 05:24
  • 1
    I want to feed the mfcc and the labels to train a neural network. – DJ_Stuffy_K Feb 06 '18 at 21:39
  • @EranW I am working on a very similar project, my question, approach and where I'm stuck is here : https://stackoverflow.com/questions/48514322/train-a-neural-network-with-mfcc-using-keras it will be great if you could shed some light. thanks – kRazzy R Feb 13 '18 at 19:21
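
On the follow-up in the comments about storing MFCCs of shape (20, X) for a neural network: one common option, not the only one, is to pad or truncate every MFCC to a fixed number of frames and stack them into a single array. The sketch below assumes that approach. The names `pad_or_truncate` and `build_dataset` and the `n_frames` parameter are hypothetical, and zero-padding is an assumption that may not suit every model.

```python
import numpy as np


def pad_or_truncate(mfcc, n_frames):
    """Force an (n_mfcc, X) array to shape (n_mfcc, n_frames)."""
    if mfcc.shape[1] >= n_frames:
        return mfcc[:, :n_frames]
    pad_width = n_frames - mfcc.shape[1]
    # Zero-pad on the right along the time axis only.
    return np.pad(mfcc, ((0, 0), (0, pad_width)), mode="constant")


def build_dataset(pairs, n_frames):
    """Stack (mfcc, label) pairs into X (N, n_mfcc, n_frames) and integer labels."""
    classes = sorted({label for _, label in pairs})
    class_to_index = {name: i for i, name in enumerate(classes)}
    X = np.stack([pad_or_truncate(m, n_frames) for m, _ in pairs])
    labels = np.array([class_to_index[label] for _, label in pairs])
    return X, labels, classes
```

The resulting arrays could then be written to disk with `np.savez('mfcc_dataset.npz', X=X, labels=labels, classes=classes)` and read back with `np.load`. The file name here is only an example.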