
I'm trying to use the IBM Speech to Text API to transcribe the audio in a Messenger message into a text transcript.

    const request = require('request');

    // download the raw audio attachment from the Messenger URL;
    // encoding: null keeps the response body as a Buffer instead of a string
    request({
      uri: attachment.url,
      method: 'GET',
      encoding: null
    }, (err, res, audio) => { /* ... do something ... */ });

I use request to get the audio file from the URL I receive in the message, but I can only get an m4a file, while Watson only supports the following formats (a rough sketch of the call I plan to make follows the list):

audio/flac

audio/wav

audio/l16

audio/ogg
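
For context, this is roughly how I plan to send a supported file to the service once I have one, reusing request; the endpoint URL is the one I found for the service and the credentials are placeholders:

    const fs = require('fs');
    const request = require('request');

    // stream a wav file to the Watson STT recognize endpoint
    fs.createReadStream('audio.wav').pipe(request.post({
      url: 'https://stream.watsonplatform.net/speech-to-text/api/v1/recognize',
      auth: { user: '<username>', pass: '<password>' }, // placeholder credentials
      headers: { 'Content-Type': 'audio/wav' } // must be one of the formats above
    }, (err, res, body) => {
      if (err) return console.error(err);
      console.log(body); // JSON transcription result
    }));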

How do I convert the m4a audio to wav to fit that spec?

Or is there another way to do this?

Thanks.

  • The keywords to look for are "nodejs" and "ffmpeg" (see the sketch after these comments). Something like http://stackoverflow.com/questions/33725893/how-do-you-use-node-js-to-stream-an-mp4-file-with-ffmpeg – Nikolay Shmyrev May 23 '16 at 18:43
  • It is actually a Watson question, so we need to add the watson tag to get the right attention – Gireesh Punathil Jun 08 '16 at 09:20
  • This is really a question about audio conversion. The audio formats supported by the Watson STT service are listed here: https://www.ibm.com/watson/developercloud/doc/speech-to-text/input.html; note that webm is also supported – Daniel Bolanos Jun 01 '17 at 22:15
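
A minimal sketch of the ffmpeg route suggested in the first comment, assuming ffmpeg is installed and using the fluent-ffmpeg npm package; the file names are placeholders:

    const ffmpeg = require('fluent-ffmpeg');

    // transcode the downloaded m4a attachment to wav, which Watson accepts
    ffmpeg('messenger-audio.m4a')          // placeholder input path
      .toFormat('wav')
      .on('error', (err) => console.error('conversion failed:', err))
      .on('end', () => console.log('wrote messenger-audio.wav'))
      .save('messenger-audio.wav');        // placeholder output path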

1 Answer


Use a package like audiobuffer-to-wav to convert your source audio file (.m4a) to a Watson STT compatible format such as wav, then use the converted file with the Watson STT API. If you need to use this library server side, you can emulate the AudioContext functionality with the web-audio-api package.
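
A rough sketch of that approach, assuming the web-audio-api and audiobuffer-to-wav npm packages; the file names are placeholders, and decoding m4a/AAC may need an additional codec depending on your setup:

    const fs = require('fs');
    const AudioContext = require('web-audio-api').AudioContext;
    const toWav = require('audiobuffer-to-wav');

    const context = new AudioContext();

    fs.readFile('messenger-audio.m4a', (err, sourceBuffer) => {
      if (err) throw err;

      // decode the compressed audio into a raw AudioBuffer
      context.decodeAudioData(sourceBuffer, (audioBuffer) => {
        // re-encode the AudioBuffer as a wav ArrayBuffer
        const wav = toWav(audioBuffer);
        fs.writeFile('messenger-audio.wav', Buffer.from(wav), (err) => {
          if (err) throw err;
          // messenger-audio.wav can now be sent to Watson as audio/wav
        });
      });
    });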

Varun