
I am interested in running the webkitSpeechRecognition API programmatically. I want to take an audio file that is uploaded to a server and use the webkitSpeechRecognition API on the back-end to recognize the text and return the result to the client.

One possibility is running some form of "embedded" version of Chrome, but I'm not sure how I would pass in the audio input. Another would be to use some form of C++ bindings to access the API, but I'm not sure if this is overly complicated.

Is this possible? How could this be accomplished?

user2398029

1 Answer


I've done this before, though not at any large scale. I used this software:

http://vb-audio.pagesperso-orange.fr/Cable/index.htm

which I found through this link:

Play audio as microphone input

With that, you can recognize anything you play through your speakers: the program creates a set of virtual speakers and a virtual mic that streams whatever audio is sent to those speakers.
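For the browser side of that setup, here is a minimal sketch, not something I have run at scale. It assumes the page is open in Chrome and that the virtual cable is the default recording device, so `webkitSpeechRecognition` hears whatever file you play into it. The names `finalTranscript` and `listen` and the `*Like` interfaces are made up for illustration; only `webkitSpeechRecognition` and its `continuous`, `interimResults`, `onresult`, `onerror`, and `start()` members come from the API itself.

```typescript
// Minimal shapes for the parts of the Web Speech API used here, declared
// locally (hypothetical names) so the sketch does not rely on DOM typings.
interface AlternativeLike { transcript: string; confidence: number; }
interface ResultLike { isFinal: boolean; length: number; [index: number]: AlternativeLike; }
interface ResultListLike { length: number; [index: number]: ResultLike; }

// Join the top alternative of every finalized result into one string.
function finalTranscript(results: ResultListLike): string {
  const parts: string[] = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (result && result.isFinal) {
      const top = result[0];
      if (top) {
        parts.push(top.transcript.trim());
      }
    }
  }
  return parts.join(" ");
}

// Start recognition on the default input device (the virtual mic, in this
// setup) and pass the transcript so far to the caller on every result event.
function listen(
  onTranscript: (text: string) => void,
  onError: (message: string) => void = () => {}
): void {
  // Chrome exposes the constructor under the webkit prefix.
  const Recognition = (globalThis as any).webkitSpeechRecognition;
  if (!Recognition) {
    throw new Error("webkitSpeechRecognition is not available in this environment");
  }
  const recognition = new Recognition();
  recognition.continuous = true;      // keep listening across pauses
  recognition.interimResults = false; // only deliver finalized results
  recognition.onresult = (event: { results: ResultListLike }) => {
    onTranscript(finalTranscript(event.results));
  };
  recognition.onerror = (event: { error: string }) => {
    onError(event.error);
  };
  recognition.start();
}
```

From there, the `onTranscript` callback could send the text back to your server however you like (for example with `fetch`); that part is up to you and is not shown here.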

As for your embedded version of Chrome, you could try grabbing the Chromium source and replacing the code that reads from the mic with code that reads from a file. I don't know how far you would get with that, though; I've never read that code.

Community