5

I would like to listen to the mic (I guess using AudioRecord) and perform some action the very moment a person starts to speak. I know I can buffer audio with AudioRecord, but how do I analyze it?

Billy ONeal
nizzle
  • Did you ever get anywhere with this? – breadbin Jun 20 '12 at 11:16
  • Yes, actually, I did get it to activate when a voice started. I never finished the project or did any (other) Android apps in Java. All I could still find was this: http://pastebin.com/2QDUxXHj (not sure if it's useful, but good luck...) – nizzle Jun 21 '12 at 11:32
  • @nizzle, did you find your answer? – Mina Dahesh Jul 01 '16 at 12:18

1 Answer

9

Well, the difficult part will be getting the phone to recognize that the sound is a voice. You could set the voice recognition system as the input instead of the mic, which might be able to do that. I don't think it will, though: from what I read just yesterday, the phone doesn't actually do the recognizing. It just opens up a live stream (like a phone call) to the Google servers, and they do the recognizing.

Also, the information I have found so far suggests that Android does not support analysis of live audio from the mic. The apps that seem to be "live" are actually taking lots of small samples and analyzing them quickly enough that they seem live. A 500 millisecond sample every 300 milliseconds seems to be common.
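To make the "500 millisecond sample every 300 milliseconds" idea concrete, here is a minimal sketch in plain Java. It assumes you already have 16-bit PCM samples buffered (for example from AudioRecord) and only works out where each overlapping window would start. The class and method names (`WindowSchedule`, `windowStarts`) are hypothetical, not part of any Android API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an Android API): works out where each analysis window starts.
final class WindowSchedule {
    /**
     * Returns the start offset, in samples, of every full window that fits
     * into totalSamples, taking a window of windowMs every hopMs.
     */
    static List<Integer> windowStarts(int totalSamples, int sampleRate, int windowMs, int hopMs) {
        int windowLen = sampleRate * windowMs / 1000; // samples per window
        int hopLen = sampleRate * hopMs / 1000;       // samples between window starts
        if (windowLen <= 0 || hopLen <= 0) {
            throw new IllegalArgumentException("window and hop must cover at least one sample");
        }
        List<Integer> starts = new ArrayList<>();
        for (int start = 0; start + windowLen <= totalSamples; start += hopLen) {
            starts.add(start);
        }
        return starts;
    }
}
```

With a 500 ms window and a 300 ms hop, consecutive windows overlap by 200 ms, so a sound that begins near the end of one window is still seen whole in the next.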

Luckily, besides my programming job, I'm also a sound technician, so I can tell you that (if you are willing to put in the work) there is a way to detect an actual voice as opposed to just sound. Every voice is made up of a few distinct ratios of frequencies that combine into the voice we hear. Each voice's ratios remain pretty constant, while different voices have different ratios (which is why voice-based passwords work). So you could take a sample, break it up into frequency bands of about 10 Hz each, and watch the amplitude of each band. When you get a frequency/amplitude pattern that looks like a voice instead of just "white noise", you'd be in business. DOING that, however, doesn't seem like it'd be easy at all. Something similar has been done before in an app called SpectralView, which displays the audio spectrum all broken up.
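As a rough illustration of "break it up into frequencies of about 10 Hz each", the sketch below runs a plain (slow) discrete Fourier transform over one frame. A frame that is 100 ms long gives bins 10 Hz apart, whatever the sample rate. The class `SpectrumSketch`, its methods, and the 300 to 3400 Hz telephone band used in the usage note are my own assumptions for illustration. Measuring how much energy falls in that band is only a crude stand-in for real voice detection, and a real app would likely use an FFT library instead of this naive loop.

```java
// Hypothetical helper: naive DFT magnitudes plus a simple band-energy measure.
final class SpectrumSketch {
    /**
     * Magnitude of DFT bins 0..n/2 for a real-valued frame of n samples.
     * Bin k corresponds to the frequency k * sampleRate / n.
     */
    static double[] magnitudes(double[] frame) {
        int n = frame.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.length; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Fraction of the non-DC spectral energy that lies between lowHz and highHz. */
    static double bandEnergyFraction(double[] mags, double binHz, double lowHz, double highHz) {
        double inBand = 0.0;
        double total = 0.0;
        for (int k = 1; k < mags.length; k++) { // skip bin 0 (the DC offset)
            double energy = mags[k] * mags[k];
            total += energy;
            double freq = k * binHz;
            if (freq >= lowHz && freq <= highHz) {
                inBand += energy;
            }
        }
        return total == 0.0 ? 0.0 : inBand / total;
    }
}
```

For example, at 8000 Hz an 800-sample frame is 100 ms long, so `binHz` is 10.0. Calling `bandEnergyFraction(mags, 10.0, 300.0, 3400.0)` then tells you how much of the frame's energy sits in the telephone voice band. Telling a voice apart from other sounds in that band would still need the pattern matching described above.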

Also, as you can see by using the Voice Search, a voice also fluctuates a lot in how loud it is. You could look for that, but it wouldn't be as reliable.
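One way to put a number on "fluctuates a lot in how loud it is" is to measure the loudness (RMS) of short frames and see how much it varies from frame to frame. This is only a sketch under my own assumptions: the names `LoudnessFluctuation`, `frameRms` and `coefficientOfVariation` are hypothetical, and any threshold you compare the result against would have to be found by experiment.

```java
// Hypothetical helper: measures how much the loudness varies between frames.
final class LoudnessFluctuation {
    /** RMS level of each consecutive, non-overlapping frame of frameLen samples. */
    static double[] frameRms(short[] pcm, int frameLen) {
        int frames = pcm.length / frameLen; // a trailing partial frame is ignored
        double[] levels = new double[frames];
        for (int f = 0; f < frames; f++) {
            double sumSquares = 0.0;
            for (int i = f * frameLen; i < (f + 1) * frameLen; i++) {
                sumSquares += (double) pcm[i] * pcm[i];
            }
            levels[f] = Math.sqrt(sumSquares / frameLen);
        }
        return levels;
    }

    /** Standard deviation divided by mean: 0 for a perfectly steady level. */
    static double coefficientOfVariation(double[] values) {
        if (values.length == 0) {
            return 0.0;
        }
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= values.length;
        if (mean == 0.0) {
            return 0.0; // pure silence: no fluctuation to measure
        }
        double variance = 0.0;
        for (double v : values) {
            variance += (v - mean) * (v - mean);
        }
        variance /= values.length;
        return Math.sqrt(variance) / mean;
    }
}
```

A steady hum gives a value near 0, while sound that keeps switching between loud and quiet gives a much larger one. As said above, this is less reliable than looking at the frequencies, since plenty of non-voice sounds fluctuate too.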

In conclusion, how do you analyze it? Well, you would have to look for a pattern in the frequencies that looks like a voice. How do you do that? Well, to be honest, I don't know for sure. Sorry.

Brandon
  • That is one awesome response! Thanks for that. I did run into SpectralView on the Google boards, sounds like I should be looking into that then. I'm going to try to make a 500ms sample like you suggested and see where it lands me. Thanks! – nizzle Jan 11 '11 at 19:17
  • You're welcome, I'm glad to help! Good luck with this. It's not going to be easy. :) – Brandon Jan 12 '11 at 15:12
  • @Brandon, I want my app to detect spoken words, e.g. "Test", and do something when it detects them. Would you give me your suggestion? Thanks! – Mina Dahesh Jul 01 '16 at 12:23