C# RTSP audio to FFMPEG to SpeechRecognitionEngine

Question

I'm trying to get an audio stream (from any source: a file, another stream, ...) into the Microsoft speech recognition engine.

So far I've got:

ffmpeg.exe -rtsp_transport tcp -i rtsp://%_return1%/audio -acodec pcm_u16le -f rtp rtp://localhost:2222

Then I have inside my code:

SpeechRecognitionEngine _engine = new SpeechRecognitionEngine(CultureInfo.CurrentCulture);    
this._engine.SetInputToAudioStream(this._rtpClient.AudioStream, new SpeechAudioFormatInfo(16000, AudioBitsPerSample.Sixteen, AudioChannel.Mono));

Then I have the events registered:

this._engine.SpeechRecognized += this.SpeechRegocnized;

this._engine.SpeechDetected += this.EngineOnSpeechDetected;

I'm not sure about the codec settings. I've tried other codecs, but none of them work.

Stream #0:0 -> #0:0 (pcm_mulaw (native) -> pcm_s16le (native)). It's not working: nothing happens, no detection event, nothing. When I connect the SpeechRecognitionEngine to my laptop mic, it does work. When I play the stream with VLC (RTSP), I do hear the audio. – MrH40XX, Apr 21 '16 at 19:42
Well, are you using the code from this answer? http://stackoverflow.com/a/15934124/432021 In that code, do you start the client? – Nikolay Shmyrev, Apr 21 '16 at 20:50
Hi, yes, I'm using that code! The only difference is that my source is an RTSP audio stream. He says he has that too, but the commands he supplies to ffmpeg suggest otherwise. I assume that in recognizer.SetInputToAudioStream(rtpClient.AudioStream, new SpeechAudioFormatInfo(WAVFile.SAMPLE_RATE, AudioBitsPerSample.Sixteen, AudioChannel.Mono)); the value of WAVFile.SAMPLE_RATE would be 8000 in my case. Furthermore: ffmpeg -rtsp_transport tcp -i rtsp://%_return1%/audio -ac 1 -ar 16000 -acodec pcm_s16le -f rtp rtp://127.0.0.1:2222 – MrH40XX, Apr 22 '16 at 15:26
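
For anyone following this thread, here is a minimal sketch of the engine setup discussed above. It is not a confirmed fix. It assumes the stream handed to the engine already delivers headerless 16 kHz, 16-bit, mono PCM, matching the -ar 16000 -ac 1 -acodec pcm_s16le options in the last comment. The class name RtpSpeechSketch, the method StartRecognition, and the raw PCM file used as a stand-in for rtpClient.AudioStream are hypothetical. The DictationGrammar and RecognizeAsync calls do not appear in the question's snippets; they are included because the engine needs a loaded grammar and a started recognition operation before it raises any events.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Speech.AudioFormat;
using System.Speech.Recognition;

class RtpSpeechSketch
{
    // Hypothetical helper: audioStream stands in for rtpClient.AudioStream and is
    // assumed to yield raw 16 kHz, 16-bit, mono PCM with no RTP or WAV headers.
    static SpeechRecognitionEngine StartRecognition(Stream audioStream)
    {
        var engine = new SpeechRecognitionEngine(CultureInfo.CurrentCulture);

        // At least one grammar must be loaded before recognition can start.
        engine.LoadGrammar(new DictationGrammar());

        // Log every stage so a silent failure is easier to localize.
        engine.AudioStateChanged += (s, e) => Console.WriteLine("Audio state: " + e.AudioState);
        engine.SpeechDetected += (s, e) => Console.WriteLine("Speech detected at " + e.AudioPosition);
        engine.SpeechRecognized += (s, e) => Console.WriteLine("Recognized: " + e.Result.Text);
        engine.SpeechRecognitionRejected += (s, e) => Console.WriteLine("Rejected");

        // The declared format has to match what ffmpeg actually emits.
        engine.SetInputToAudioStream(
            audioStream,
            new SpeechAudioFormatInfo(16000, AudioBitsPerSample.Sixteen, AudioChannel.Mono));

        // Without this call the engine never reads from the stream.
        engine.RecognizeAsync(RecognizeMode.Multiple);
        return engine;
    }

    static void Main(string[] args)
    {
        // Assumption: args[0] is the path to a raw PCM file in the format above,
        // used here only as a stand-in for the live RTP stream.
        using (Stream audio = File.OpenRead(args[0]))
        using (SpeechRecognitionEngine engine = StartRecognition(audio))
        {
            Console.WriteLine("Listening; press Enter to stop.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```

If the AudioStateChanged handler never reports anything other than Stopped, the engine is probably not receiving audio it can interpret, which would point at the stream contents or the declared format rather than at the event wiring.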

0 Answers