Speaker Recognition System using Python

Question

I'm trying to build a speaker recognition (speaker, not speech) system using Python. I've extracted MFCC features from both the training audio file and the test audio file and have built a GMM for each. I'm not sure how to compare the models to compute a similarity score that the system can use to validate the test audio. I've been struggling with this for 4 days and would be glad if someone could help.

I'm taking 3 audio files to build the training model (.gmm), and then one more audio clip (the test clip) to compare against the training model and compute the similarity — Tilak Sharma, Apr 22 '18 at 08:06
Possible duplicate of [Python Speaker Recognition](https://stackoverflow.com/questions/7309219/python-speaker-recognition) — Nikolay Shmyrev, Apr 22 '18 at 15:06
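
For the setup described in the question, one common approach is to skip fitting a second GMM to the test clip and instead score the test clip's MFCC frames directly against the trained speaker model, using the average per-frame log-likelihood as the similarity score. The following is a minimal sketch under stated assumptions: the function name `avg_log_likelihood` is hypothetical, the GMM is assumed to use diagonal covariances, and its weights, means and variances are assumed to have already been loaded from the trained model as NumPy arrays.

```python
import numpy as np

def avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of MFCC frames under a diagonal GMM.

    frames:    (n_frames, n_dims) MFCC vectors of the test clip
    weights:   (n_components,) mixture weights, summing to 1
    means:     (n_components, n_dims) component means
    variances: (n_components, n_dims) diagonal covariances
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Log density of every frame under every Gaussian component.
    diff = frames[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )

    # Add log mixture weights, then log-sum-exp over components (numerically stable).
    weighted = log_gauss + np.log(weights)[None, :]
    peak = weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(weighted - peak).sum(axis=1))

    # Averaging makes clips of different lengths comparable.
    return float(log_frame.mean())


# Toy usage with made-up model parameters (not real speaker data).
rng = np.random.default_rng(0)
toy_weights = np.array([0.5, 0.5])
toy_means = np.array([[0.0, 0.0], [5.0, 5.0]])
toy_variances = np.ones((2, 2))
close_frames = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
far_frames = rng.normal(loc=20.0, scale=1.0, size=(200, 2))
print(avg_log_likelihood(close_frames, toy_weights, toy_means, toy_variances))
print(avg_log_likelihood(far_frames, toy_weights, toy_means, toy_variances))
```

A higher score means the test frames fit the speaker model better. The accept/reject threshold is not something this sketch can supply; it would have to be tuned on held-out recordings, or replaced by comparing the score against those of other speakers' models.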

score -1 · Answer 1 · answered Apr 22 '18 at 08:09

From what I can understand from the question, you are describing an aspect of the cocktail party problem. I have found a white paper with a solution to your problem: it uses a modified iterative Wiener filter and a multi-layer perceptron neural network to separate speakers into separate channels.

Interestingly, the cocktail party problem can be solved in one line in Octave: `[W,s,v]=svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');`
You can read more about it in this Stack Overflow post.
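
Since the question is about Python, here is a rough NumPy transcription of the Octave one-liner above. It is only a sketch: the function name `svd_separation` is hypothetical, and `x` is assumed to be a 2-D array of shape (channels, samples), matching how the Octave expression indexes it.

```python
import numpy as np

def svd_separation(x):
    """NumPy transcription of the Octave expression
    [W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');

    x is assumed to have shape (n_channels, n_samples).
    """
    x = np.asarray(x, dtype=float)

    # sum(x.*x, 1): energy of each sample summed across channels, shape (1, n_samples).
    energy = np.sum(x * x, axis=0, keepdims=True)

    # Broadcasting plays the role of repmat: scale every channel by the per-sample energy.
    weighted = energy * x

    # (n_channels, n_channels) matrix whose SVD the one-liner computes.
    W, s, vh = np.linalg.svd(weighted @ x.T)

    # Octave returns V, while NumPy returns its transpose, so transpose back.
    return W, s, vh.T


# Toy usage on random two-channel data (not real audio).
rng = np.random.default_rng(0)
mixed = rng.normal(size=(2, 1000))
W, s, v = svd_separation(mixed)
print(W.shape, s.shape, v.shape)
```

Note that `s` comes back as a 1-D array of singular values in NumPy, whereas Octave returns a diagonal matrix.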
