Speech recognition (web) services?

Question

I have a buffer of audio and I'd like to perform speech recognition/transcription on it. I have limited CPU and RAM locally so I want to perform recognition on a server.

Are there any (web) services that allow me to do this?

My searches so far have led nowhere...

I found Spinvox Create, too... – Dave Peck Apr 22 '10 at 18:54

score 2 · Answer 1 · answered Feb 12 '11 at 07:16

Google has just introduced browser-based access to its speech engine through HTML5.

http://slides.html5rocks.com/#speech-input

To get this page to work, I launched the Chromium browser as follows in Ubuntu:

$ chromium-browser --enable-speech-input

I believe that the idea is to be able to build applications that use Google's speech recognizer, but I haven't had a chance to look deeply into it.

Another interesting project is WAMI from MIT: http://wami.csail.mit.edu

wwwilliam

And... since Chromium is OSS, I just spent some time and discovered that yes, indeed, there is a RESTful service endpoint that it talks to. It shouldn't be too hard to build a separate library for invoking recognition... – Dave Peck Feb 13 '11 at 04:18
I didn't work on it, although it should be fairly straightforward to implement an API in Python/Ruby/etc that does what Chromium does... assuming you can find a Speex codec API for your language of choice. – Dave Peck Jun 06 '12 at 04:09
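
The idea in the comments above (post an encoded audio buffer to a recognition endpoint over HTTP) could be sketched roughly as follows. This is only an illustration under stated assumptions: `RECOGNIZE_URL` is a made-up placeholder and not the endpoint Chromium talks to, the `audio/speex` content type and its `rate` parameter are guesses at what such a service might expect, and the response format is unknown. The audio is assumed to be Speex-encoded already by some codec library.

```python
import urllib.request

# Hypothetical endpoint: a placeholder, NOT the URL Chromium actually calls.
RECOGNIZE_URL = "https://speech.example.com/recognize"


def build_recognition_request(audio: bytes, sample_rate: int = 16000,
                              url: str = RECOGNIZE_URL) -> urllib.request.Request:
    """Wrap an already-encoded audio buffer in an HTTP POST request."""
    if not audio:
        raise ValueError("audio buffer is empty")
    headers = {
        # Assumed content type; a real service may require something different.
        "Content-Type": f"audio/speex; rate={sample_rate}",
    }
    return urllib.request.Request(url, data=audio, headers=headers, method="POST")


def recognize(audio: bytes, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body, undecoded."""
    request = build_recognition_request(audio)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Building the request separately from sending it makes it possible to inspect the headers and body without touching the network, which helps while the real endpoint and its expected format are still being worked out.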

score 1 · Answer 2 · answered Apr 18 '10 at 21:50

Lumenvox offers such a service but seems expensive for your needs.

clyfe

This is a good find, though their programmer documentation is nonexistent. Looks like it is "buy first, understand later." I also found Spinvox Create, for which the docs are available -- but it is a terrible bunch of web API cruft, requiring custom headers, digest authentication, and multipart posts containing XML and base64-encoded audio in a format that's not outrageous but that I can't easily convert to on my device... – Dave Peck Apr 22 '10 at 18:56
