I am working on a speaker identification project. To decide whether two voice clips come from the same speaker, I extract multiple features such as MFCC, tempo, chromagram, beat times, harmonic and percussive components, mel spectrogram, etc. Now I also want to find the pitch of a voice clip. To find the pitch, I am using this code:
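import librosa
import numpy as np

# Load the clip (librosa resamples to 22050 Hz by default)
y, sr = librosa.load('E:/Audio_clip/cant.wav')
# Magnitude spectrogram of the clip
S = np.abs(librosa.stft(y))
# print(S)
pitch, mag = librosa.piptrack(y=y, sr=sr, S=S)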
But when I print pitch and mag to the console, I get the same output for every clip: both pitch and mag appear to be arrays of zeros. I also get the error: 'NoneType' object is not iterable
Can anybody suggest where I am going wrong, or how I can find the pitch of a voice clip?
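For context, here is a minimal sketch of what I am trying to get out of piptrack. As far as I understand the librosa documentation, it returns two arrays of shape (frequency bins, frames) that are zero everywhere except at detected spectral peaks, so a printed summary can look all-zero even when some entries are not. The helper names dominant_pitch_per_frame and pitch_track are my own (hypothetical), and picking the strongest bin per frame is only one assumed way to reduce the output to a single pitch per frame:

```python
import numpy as np

def dominant_pitch_per_frame(pitches, magnitudes):
    """For each frame (column), return the pitch of the bin with the largest magnitude.

    Both inputs are assumed to have shape (n_bins, n_frames), as piptrack returns.
    Frames where no peak was detected come out as 0.
    """
    pitches = np.asarray(pitches)
    magnitudes = np.asarray(magnitudes)
    strongest_bin = magnitudes.argmax(axis=0)   # one bin index per frame
    frames = np.arange(pitches.shape[1])
    return pitches[strongest_bin, frames]

def pitch_track(path):
    """Hypothetical wrapper: load a clip and reduce piptrack output to one pitch per frame."""
    import librosa  # imported here so the helper above can be used without librosa
    y, sr = librosa.load(path)
    pitches, magnitudes = librosa.piptrack(y=y, sr=sr)
    return dominant_pitch_per_frame(pitches, magnitudes)
```

Calling pitch_track('E:/Audio_clip/cant.wav') would then give one value in Hz per frame, and something like the mean over the non-zero frames could serve as a single pitch feature for the clip.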
So far, my plan for identifying a speaker is this: first I build a feature matrix from these features, then I use a cosine similarity function to decide whether two clips come from the same speaker. Is this a good approach for speaker identification?
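To make the plan concrete, this is roughly what I have in mind, as a sketch only. The names clip_vector and cosine_similarity are my own (hypothetical), and summarising each clip by the per-feature mean and standard deviation over time is an assumption I made so that clips of different lengths give vectors of the same size:

```python
import numpy as np

def clip_vector(features):
    """Collapse a (n_features, n_frames) matrix into one fixed-length vector.

    Assumption: the mean and standard deviation over time are enough to
    describe a clip; this discards all temporal ordering.
    """
    features = np.asarray(features, dtype=float)
    return np.concatenate([features.mean(axis=1), features.std(axis=1)])

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)
```

I would compute clip_vector on, say, the MFCC matrix of each clip and then compare two clips with cosine_similarity, treating a score above some threshold as "same speaker". The threshold would have to be chosen from labelled clips; I do not have a value for it yet.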