1

I've been searching everywhere for some form of gender detection based on the frequency data of an audio file. I haven't found a program that does this, or even anything that outputs the audio data so that I could write a basic program to read it and determine the gender of the speaker.

Do any of you know where I can find something to help me with this?

To reiterate, I want a program that, when a person talks into a microphone, reports the gender of the speaker with a fair amount of accuracy. My full plan is to add a speech-to-text feature as well, so the program will write out what the speaker said and give some extremely basic demographics on the speaker.

*Preferably in a common scripting language that's cross-platform or supported on Linux.

Cyrusc
    Possible repeat: http://stackoverflow.com/questions/5062032/audio-analysis-to-detect-human-voice-gender-age-and-emotion-any-prior-open – David Feb 04 '13 at 04:38
  • Possible duplicate of [Gender detection of the speaker from wave data of the audio](https://stackoverflow.com/questions/30397126/gender-detection-of-the-speaker-from-wave-data-of-the-audio) – Nikolay Shmyrev Apr 24 '18 at 07:13

2 Answers

2

Although this is an old question, for anyone still interested in gender detection from audio: you can do this by extracting MFCC (mel-frequency cepstral coefficient) features and modeling them with a GMM (Gaussian mixture model), a machine learning model.

The following tutorial implements this approach and evaluates it on a gender-labelled subset extracted from Google's AudioSet:

https://appliedmachinelearning.wordpress.com/2017/06/14/voice-gender-detection-using-gmms-a-python-primer/
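
For a rough idea of what the MFCC + GMM approach looks like in code, here is a minimal sketch. It is not the tutorial's code: the function names (`mel_filterbank`, `mfcc`, `train_gmm`, `classify`), the frame settings (512-sample frames, 160-sample hop, 26 mel bands, 13 coefficients) and the GMM size are all my own assumptions. It uses only numpy, scipy and scikit-learn, training one GMM per gender and picking the model with the higher average log-likelihood.

```python
import numpy as np
from scipy.fft import dct
from sklearn.mixture import GaussianMixture


def mel_filterbank(sr, n_fft, n_mels=26):
    """Triangular filters spaced evenly on the mel scale (hypothetical helper)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sr, n_fft=512, hop=160, n_mels=26, n_ceps=13):
    """Return an (n_frames, n_ceps) array of MFCCs for a mono signal."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts the high frequencies a little.
    signal = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    log_mel = np.log(mel_energy + 1e-10)   # small floor avoids log(0)
    return dct(log_mel, type=2, axis=1, norm="ortho")[:, :n_ceps]


def train_gmm(feature_list, n_components=8, seed=0):
    """Fit one diagonal-covariance GMM on MFCC frames pooled from many clips."""
    X = np.vstack(feature_list)
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          reg_covar=1e-3, random_state=seed)
    return gmm.fit(X)


def classify(signal, sr, gmms):
    """Return the label whose GMM gives the highest average log-likelihood."""
    feats = mfcc(signal, sr)
    scores = {label: gmm.score(feats) for label, gmm in gmms.items()}
    return max(scores, key=scores.get)
```

To use it you would load labelled recordings (for example with `scipy.io.wavfile.read`), build `gmms = {"male": train_gmm(male_feats), "female": train_gmm(female_feats)}` from the per-clip `mfcc(...)` arrays, and then call `classify(signal, sr, gmms)` on a new clip. How well it works depends entirely on the training data.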

Abhijeet Singh
1

You're going to want to look into formant detection and linear predictive coding (LPC). Here's a paper that has some signal flow diagrams that could be ported over to scipy/numpy.
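
To give an idea of what LPC-based formant detection might look like in numpy, here is a minimal sketch. It is not taken from the paper: the function names (`lpc`, `formants`), the `2 + sr // 1000` rule of thumb for the LPC order, and the 90 Hz / 400 Hz thresholds for discarding spurious roots are my own assumptions. It fits an all-pole model to one frame with the autocorrelation method (Levinson-Durbin recursion) and reads the formant frequencies off the angles of the model's poles.

```python
import numpy as np


def lpc(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin.

    Returns the prediction-error filter coefficients [1, a1, ..., a_order].
    """
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error
    return a


def formants(frame, sr, order=None):
    """Estimate formant frequencies (Hz) of one voiced frame, lowest first."""
    if order is None:
        order = 2 + sr // 1000              # assumed rule of thumb, tune as needed
    frame = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    roots = np.roots(lpc(frame, order))
    roots = roots[np.imag(roots) > 0]       # keep one root of each conjugate pair
    freqs = np.angle(roots) * sr / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * sr / np.pi
    keep = (freqs > 90) & (bandwidths < 400)  # assumed thresholds for real resonances
    return np.sort(freqs[keep])
```

You would call `formants(frame, sr)` on short voiced frames (a few tens of milliseconds each) and then compare the resulting frequencies across speakers. The thresholds and the model order will likely need tuning for real recordings.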

blaise