Extracting features from audio signal

Question

I have just started to work on data in the form of audio. I am using librosa as a tool. My project requires me to extract features like:

Total duration of the audio
Minimum Intensity of the audio signal
Maximum Intensity of the audio signal
Mean Intensity of the audio signal
Jitter
Rate of speaking
Number of Pauses
Maximum Duration of Pauses
Average Duration of Pauses
Total Duration of Pauses

I know what these terms mean, but I have no idea how to extract them from an audio file. Are they built into librosa in some form, for example in the `librosa.feature` module? Or do we need to calculate them manually? Can someone guide me on how to proceed?

I know that this job can be done with software like Praat, but I need to do it in Python.

Praat can be used for spectral analysis (spectrograms), pitch analysis, formant analysis, intensity analysis, jitter, shimmer, and voice breaks.

You can start reading the documentation https://librosa.github.io/librosa/ — Atirag, Feb 11 '19 at 21:17
I have already gone through the doc and turned to SO only because the doc didn't help me. — paradocslover, Feb 11 '19 at 21:19
`import scipy.io.wavfile` https://stackoverflow.com/a/24391521/1755108 — brokenfoot, Feb 11 '19 at 21:24
@brokenfoot can you please explain how to use the obtained data? I got a bit of a hint, but it would be really helpful if you could shed some more light. — paradocslover, Feb 11 '19 at 21:37
For example, to get the duration: `scipy.io.wavfile.read(filename)` gives you (1) the rate, i.e. samples per second, and (2) the samples. You can calculate the duration as (2) divided by (1), i.e. the number of samples divided by the rate. — brokenfoot, Feb 11 '19 at 21:45
Can you please explain the same for `number of pauses`? Actually, the function returns a 2-D NumPy array and I am not able to understand the contents of that array. Is it intensity, frequency, or something else? — paradocslover, Feb 11 '19 at 21:59
It's a 2-D array because the audio is dual channel. You can access each channel as `data[:,0]` and `data[:,1]`. The data is the signal amplitude over time (intensity), not frequency; you need an FFT to get frequency. To get the number of pauses, you need to define what a pause is. I don't think there is a function in SciPy that looks at the audio and detects pauses in it. — brokenfoot, Feb 12 '19 at 00:23
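
The suggestions in the comments above (duration as samples divided by rate, and counting pauses once a pause has been defined) could be sketched roughly as follows. This is only a minimal illustration and not a Praat-equivalent method. It assumes a pause is a run of frames whose RMS level is more than 40 dB below the loudest frame and that lasts at least 0.2 s. The function names `to_mono`, `frame_rms_db` and `pause_stats`, and the parameters `frame_sec`, `silence_db` and `min_pause_sec`, are hypothetical choices made for this example, and the thresholds would need tuning for real recordings.

```python
import numpy as np


def to_mono(samples):
    """Average the channels of an (n_samples, n_channels) array into one signal."""
    samples = np.asarray(samples, dtype=np.float64)
    return samples.mean(axis=1) if samples.ndim == 2 else samples


def frame_rms_db(signal, frame_len):
    """RMS level of each non-overlapping frame, in dB relative to the loudest frame."""
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return np.array([])
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    ref = rms.max() if rms.max() > 0 else 1.0
    # Clamp to a tiny floor so that digital silence does not produce log10(0).
    return 20.0 * np.log10(np.maximum(rms, 1e-10) / ref)


def pause_stats(samples, rate, frame_sec=0.02, silence_db=-40.0, min_pause_sec=0.2):
    """Total duration plus pause statistics.

    Assumption: a "pause" is a run of frames quieter than `silence_db`
    (relative to the loudest frame) that lasts at least `min_pause_sec`.
    """
    signal = to_mono(samples)
    frame_len = max(1, int(round(rate * frame_sec)))
    silent = frame_rms_db(signal, frame_len) < silence_db

    # Collect the length (in frames) of every run of consecutive silent frames.
    runs, current = [], 0
    for is_silent in silent:
        if is_silent:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)

    frame_dur = frame_len / rate
    pauses = [n * frame_dur for n in runs if n * frame_dur >= min_pause_sec]
    return {
        "duration_sec": len(signal) / rate,
        "n_pauses": len(pauses),
        "max_pause_sec": max(pauses, default=0.0),
        "mean_pause_sec": sum(pauses) / len(pauses) if pauses else 0.0,
        "total_pause_sec": sum(pauses),
    }
```

With a WAV file, the inputs would come from `rate, samples = scipy.io.wavfile.read(filename)`, followed by `pause_stats(samples, rate)`. The per-frame levels returned by `frame_rms_db` could also be summarised with `.min()`, `.max()` and `.mean()` as a rough stand-in for the intensity features, although they are relative levels and not calibrated dB SPL values like those Praat reports.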

