Parsing speech to text from the sound card

Question

I've been experimenting with PortAudio and with natural language processing, and I'm wondering whether there is a way to combine the two. What I would like is a service that takes the audio from a video (the video format can be whatever is easiest) and converts the speech to text as it plays. I know this is fairly straightforward on Android, but I would like it to work on a desktop or laptop computer and, if possible, in real time. Once I have the text of the speech being played by the sound card, my mapping is already determined, but I am a little lost on how to implement that first part. I know about Dragon, but I would prefer something non-proprietary for the actual speech-to-text step. The recognizer need not be perfect: I can deal with spelling errors and irregular word forms. Any thoughts?

That said, it *sounds* like you are trying to capture audio as it is being played back through the sound card (à la Audio Hijack Pro). PortAudio does not do that. On the Mac, you want something like Soundflower. I'm not sure about other platforms. — Bjorn Roche, Jun 14 '12 at 15:06
There are some other good questions with answers that might help you get started. Try http://stackoverflow.com/q/6348770/90236 — Michael Levy, Jun 14 '12 at 17:59

0 Answers