iPhone: Recognize the human voice accurately

Question

I am developing an application that needs to recognize a human voice (to be precise, a baby crying). I referred to the following articles on recording sound from the iPhone microphone and sampling it:

http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/
http://developer.apple.com/library/ios/#samplecode/aurioTouch/Introduction/Intro.html
http://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html

...but I didn't understand how to accurately distinguish a human voice from other sounds. Any help or sample code would be really helpful.

So far I have written the following code:

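- (void)levelTimerCallback:(NSTimer *)timer {
  [recorder updateMeters];

  // Smoothing factor for the exponential low-pass filter.
  const double ALPHA = 0.05;

  // peakPowerForChannel: reports decibels (0 dB = full scale); convert to a linear level in 0..1.
  double peakPowerForChannel = pow(10, (0.05 * [recorder peakPowerForChannel:0]));
  lowPassResults = ALPHA * peakPowerForChannel + (1.0 - ALPHA) * lowPassResults;

  // lowPassResults is a smoothed loudness level, not a frequency.
  NSLog(@"Smoothed level: %f", lowPassResults);
  NSLog(@"Average input: %f Peak input: %f", [recorder averagePowerForChannel:0], [recorder peakPowerForChannel:0]);

  // As written, this fires whenever the smoothed level is below 0.95 (quiet input).
  if (lowPassResults < 0.95)
    [self playSound];
}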

Thanks.

Aha... I forgot to append my code. :) It is the levelTimerCallback: method shown in the question. – Paresh Masani, Apr 26 '11 at 10:17
I basically used the code given at http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/. Based on Andrew's response, I don't think I will be able to recognise the sound of a baby crying. Is there any way to find out what the value of lowPassResults in my code above would be for a baby crying? Are there any documents that list different sounds with their frequency/amplitude? – Paresh Masani, Apr 26 '11 at 10:23

score 0 · Accepted Answer · edited May 23 '17 at 11:44

This is a very difficult problem. Speech recognition is a complex subject, and even massive companies struggle to get it right. One suggestion would be to sample the audio and check whether its pitch falls within a certain high-pitched range. Beyond that, you would need to read up on speech recognition theory.

As this answer shows, speech recognition is not something the iPhone SDK provides, so there will not be a simple answer.
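
The range check suggested above could be sketched roughly as follows. This is only an illustration in plain C (which also compiles inside an Objective-C file), not a cry detector: it uses autocorrelation, a technique chosen here and not named in the answer, to estimate the pitch of one buffer of raw samples. The function names `estimate_pitch_hz` and `pitch_in_band` are hypothetical, and any band limits passed in are placeholders you would have to measure from real recordings. It also assumes you already have raw mono samples; the level meters of the recorder used in the question do not provide them.

```c
#include <stddef.h>

/* Hypothetical helper: estimate the pitch (Hz) of a mono buffer by
   autocorrelation, searching only between min_hz and max_hz.
   Returns 0.0 for invalid arguments, a too-short buffer, or near silence. */
double estimate_pitch_hz(const float *samples, size_t count,
                         double sample_rate, double min_hz, double max_hz)
{
    if (samples == NULL || sample_rate <= 0.0 || min_hz <= 0.0 || max_hz <= min_hz)
        return 0.0;

    /* A period of N samples corresponds to a frequency of sample_rate / N. */
    size_t min_lag = (size_t)(sample_rate / max_hz);
    size_t max_lag = (size_t)(sample_rate / min_hz);
    if (min_lag < 1)
        min_lag = 1;
    if (count <= max_lag)
        return 0.0;

    /* Use the same number of products for every lag so no lag is favoured. */
    size_t window = count - max_lag;

    /* Skip near-silent buffers (threshold is an arbitrary assumption). */
    double energy = 0.0;
    for (size_t i = 0; i < window; i++)
        energy += (double)samples[i] * samples[i];
    if (energy / (double)window < 1e-6)
        return 0.0;

    /* The lag with the strongest self-similarity is taken as the period. */
    size_t best_lag = 0;
    double best_sum = 0.0;
    for (size_t lag = min_lag; lag <= max_lag; lag++) {
        double sum = 0.0;
        for (size_t i = 0; i < window; i++)
            sum += (double)samples[i] * samples[i + lag];
        if (sum > best_sum) {
            best_sum = sum;
            best_lag = lag;
        }
    }
    return best_lag == 0 ? 0.0 : sample_rate / (double)best_lag;
}

/* Hypothetical helper: 1 if hz lies inside [low_hz, high_hz], else 0. */
int pitch_in_band(double hz, double low_hz, double high_hz)
{
    return hz >= low_hz && hz <= high_hz;
}
```

A caller would run `estimate_pitch_hz` on each incoming buffer and treat a sound as a candidate only if `pitch_in_band` stays true over many consecutive buffers. Pitch alone cannot separate a cry from other sounds in the same range, which is why this remains a hard problem.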

answered Apr 21 '11 at 14:40 by OrangeAlmondSoap · edited May 23 '17 at 11:44 by Community

Thanks Andrew. I don't need exact sound recognition, but if I could find out the typical frequency/amplitude of a baby crying, that would be fine too. – Paresh Masani Apr 26 '11 at 10:38
