
I am working on a speaker identification project. To decide whether two voice clips come from the same speaker, I extract several features: MFCC, tempo, chromagram, beat times, harmonic and percussive components, mel spectrogram, etc. I also want to find the pitch of a voice clip, and for that I am using this code:

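import librosa
import numpy as np  # needed for np.abs below

# Load the clip (librosa resamples to 22050 Hz by default)
y, sr = librosa.load('E:/Audio_clip/cant.wav')
# Magnitude spectrogram
S = np.abs(librosa.stft(y))
# pitch and mag both have shape (n_freq_bins, n_frames)
pitch, mag = librosa.piptrack(y=y, sr=sr, S=S)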

But when I print pitch and mag to the console, I get the same output for every clip: both arrays appear to contain only zeros. I also get the error: 'NoneType' object is not iterable

Can anybody tell me where I am going wrong, or how I can find the pitch of a voice clip?

My plan so far for identifying a speaker is to first build a feature matrix from these features, and then use a cosine similarity function to decide whether two clips come from the same speaker. Is this a good approach for speaker identification?
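To make the comparison step concrete, here is a minimal sketch of cosine similarity between two feature vectors. It assumes each clip has already been reduced to a fixed-length vector (for example by averaging MFCCs over time); the function name and the example vectors are hypothetical, not taken from my project.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # At least one vector is all zeros, so the angle is undefined;
        # returning 0.0 here is an arbitrary choice for this sketch.
        return 0.0
    return float(np.dot(a, b) / denom)

# Hypothetical per-clip feature vectors (made-up numbers for illustration).
clip_a = [1.0, 2.0, 3.0]
clip_b = [2.0, 4.0, 6.0]   # same direction as clip_a
clip_c = [3.0, 0.0, -1.0]  # orthogonal to clip_a

sim_ab = cosine_similarity(clip_a, clip_b)
sim_ac = cosine_similarity(clip_a, clip_c)
```

Any threshold on the similarity score for declaring "same speaker" would have to be chosen on labelled clips.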

Sandeep

1 Answer


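Zeros at the beginning of the arrays are expected; you are just not printing or plotting the array in full. NumPy abbreviates large arrays when printing, so you only see the corners, which happen to be zero. Some values inside are non-zero, but they are fairly rare. For details, see the topic on the librosa group.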

Also see "Librosa pitch tracking - STFT" and "How to print the full NumPy array?"
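As a rough sketch of how to check this without printing everything: count the non-zero entries, then pick the strongest bin in each frame. The arrays below are a small synthetic stand-in with the same layout as the piptrack output (frequency bins by frames), and `dominant_pitch_per_frame` is a hypothetical helper name, not part of librosa.

```python
import numpy as np

def dominant_pitch_per_frame(pitches, magnitudes):
    """One pitch estimate (Hz) per frame; 0.0 where the frame has no peak."""
    pitches = np.asarray(pitches, dtype=float)
    magnitudes = np.asarray(magnitudes, dtype=float)
    # Row index of the strongest bin in every frame (column).
    best_bin = magnitudes.argmax(axis=0)
    frames = np.arange(pitches.shape[1])
    return pitches[best_bin, frames]

# Synthetic stand-in: 4 frequency bins x 3 frames (made-up values).
pitches = np.array([[0.0,   0.0,   0.0],
                    [0.0, 220.0, 220.0],
                    [0.0,   0.0, 440.0],
                    [0.0,   0.0,   0.0]])
magnitudes = np.array([[0.0, 0.0, 0.0],
                       [0.0, 0.8, 0.2],
                       [0.0, 0.0, 0.9],
                       [0.0, 0.0, 0.0]])

# How many entries are actually non-zero?
n_nonzero = int(np.count_nonzero(pitches))

track = dominant_pitch_per_frame(pitches, magnitudes)
voiced = track[track > 0]  # drop frames where no pitch was detected
mean_pitch = float(voiced.mean()) if voiced.size else 0.0
```

With real data you would pass the two arrays returned by `librosa.piptrack` instead of the synthetic ones. Taking the strongest bin per frame is only one simple way to reduce the matrix to a pitch track.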

Nikolay Shmyrev