
Google has recently made great progress with its speech recognition software, which is used in several open source products, e.g. Chromium Web Speech and Android hands-free texting. I would like to use this speech recognition as part of my server stack, but I can't find much information about it.

Is the speech recognition software available as a library or package? Alternatively, can I call Chromium from another program to transcribe an audio file to text?

Jeroen Ooms
  • See similar questions http://stackoverflow.com/questions/12489321/using-google-api-speech-to-text-on-pc-version and http://stackoverflow.com/questions/7879804/does-anyone-uses-google-speech-api-in-production – Michael Levy Mar 24 '13 at 20:31
  • I think these answers may be outdated; Google started making some parts public in early 2013. – Jeroen Ooms Mar 25 '13 at 19:55
  • Got a link? It would be helpful. – Michael Levy Mar 25 '13 at 22:08
  • E.g. http://bgr.com/2013/01/14/google-chrome-speech-recognition-api-291569/ and https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#api_description. But this is about interfacing within Chrome; I can't find it as a standalone library. – Jeroen Ooms Mar 25 '13 at 23:21

2 Answers

The Web Speech APIs are designed to be used only in the context of Chrome or Android. A lot of the work happens in the client, so there is no public server-to-server API that would simply take an audio file and process it.

If you search GitHub you will find tools such as https://gist.github.com/alotaiba/1730160, but I am pretty certain that this method of access is not supported or endorsed, and there is no guarantee that it will keep working.

Kinlan

The method previously mentioned at https://gist.github.com/alotaiba/1730160 does work for me. I use it daily in my home automation programs. A Python script captures audio and decides whether it is useful speech or just noise, then sends the short audio snippet to Google and gets the text back, all in under a second. I have successfully integrated it into my programs, and if you search around you will find more people who have done the same.
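
The script itself is not shown here, so the following is only a rough sketch of the "useful audio or just noise" step, assuming a simple energy threshold on 16-bit mono PCM. The function names, the 20 ms frame size, and the threshold of 500 are all illustrative assumptions, not taken from the gist or from my actual setup. Sending the snippet to Google is left out on purpose, since that endpoint is unofficial.

```python
import math
import struct


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def find_speech_segments(pcm: bytes, sample_rate: int = 16000,
                         frame_ms: int = 20, threshold: float = 500.0,
                         min_frames: int = 5):
    """Return (start_byte, end_byte) spans that stay above the threshold.

    Runs shorter than min_frames are treated as noise and dropped.
    All defaults are assumed values and would need tuning per microphone.
    """
    frame_bytes = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
    segments = []
    start = None
    for offset in range(0, len(pcm), frame_bytes):
        loud = rms(pcm[offset:offset + frame_bytes]) >= threshold
        if loud and start is None:
            start = offset  # a loud run begins
        elif not loud and start is not None:
            if (offset - start) // frame_bytes >= min_frames:
                segments.append((start, offset))
            start = None  # the loud run has ended
    # Handle a loud run that lasts until the end of the buffer.
    if start is not None and (len(pcm) - start) // frame_bytes >= min_frames:
        segments.append((start, len(pcm)))
    return segments


def tone(freq: float, seconds: float, amplitude: int,
         sample_rate: int = 16000) -> bytes:
    """Synthetic sine tone, used here only to stand in for speech."""
    n = int(seconds * sample_rate)
    values = (int(amplitude * math.sin(2 * math.pi * freq * i / sample_rate))
              for i in range(n))
    return struct.pack("<%dh" % n, *values)


if __name__ == "__main__":
    # Half a second of silence, a tone, then silence again.
    audio = bytes(16000) + tone(440, 0.5, 8000) + bytes(16000)
    for begin, end in find_speech_segments(audio):
        snippet = audio[begin:end]  # this is what would be sent off
        print(begin, end, len(snippet))
```

A fixed threshold like this is easy to fool with steady background noise; a real setup would probably calibrate it against the room's ambient level first.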