I did a lot of research and came up with the code below. It successfully translates speech to text using the microphone.

I have a file on my web server that streams audio as MP3. It is just a link to an MP3 file. I need to have that audio translated to text.

I am trying to figure out the best way to do this. Can you select the computer's audio output as the input (i.e., play the audio in the web browser and have that recognized)? Or can you stream the audio directly to the recognizer? I think I need to use the SetInputToWaveStream method, but I do not understand how to use it.

Private Sub InitializeRecognizerSynthesizer()
    ' Pick the installed recognizer that matches the current thread's culture.
    Dim selectedRecognizer = ( _
        From e In SpeechRecognitionEngine.InstalledRecognizers() _
        Where e.Culture.Equals(Thread.CurrentThread.CurrentCulture)).FirstOrDefault()
    recognizer = New SpeechRecognitionEngine(selectedRecognizer)
    ' VB.NET subscribes to events with AddHandler rather than the C# += operator.
    AddHandler recognizer.AudioStateChanged, AddressOf recognizer_AudioStateChanged
    AddHandler recognizer.SpeechHypothesized, AddressOf recognizer_SpeechHypothesized
    AddHandler recognizer.SpeechRecognized, AddressOf recognizer_SpeechRecognized

    synthesizer = New SpeechSynthesizer()
End Sub

Private Function SelectInputDevice() As Boolean
    Dim proceedLoading As Boolean = True

    If IsOscompatible() Then
        Try
            recognizer.SetInputToDefaultAudioDevice()
        Catch
            ' No audio input device is available.
            proceedLoading = False
        End Try
    Else
        ' AddressOf passes the method as a callback instead of calling it.
        ThreadPool.QueueUserWorkItem(AddressOf InitSpeechRecogniser)
    End If
    Return proceedLoading
End Function

1 Answer

recognizer.SetInputToWaveFile(file) - will read the audio input from a file in the file system.

recognizer.SetInputToAudioStream - will read the audio input from a stream. A short example:

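// Open the audio data as a stream.
FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
// Describe the audio in the stream; this must match the actual data (here 8 kHz, 16-bit, mono).
SpeechAudioFormatInfo format = new SpeechAudioFormatInfo(8000, AudioBitsPerSample.Sixteen, AudioChannel.Mono);
recognizer.SetInputToAudioStream(fs, format);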

When reading from a stream or a file, you must take care that the audio data is in a supported format. For example, one format I know works on my machine is:

  • 8 bits per sample
  • single channel (mono)
  • 22,050 samples per second
  • PCM encoding
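
One way to check a file against a list like the one above is to read the "fmt " chunk of its RIFF/WAVE header before handing it to the recognizer. The following is only a minimal sketch: `WaveHeader` and `WaveFormatSummary` are hypothetical helper names, not part of System.Speech, and the code assumes a plain RIFF/WAVE file whose "fmt " chunk is at least 16 bytes long.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical container for the header fields that matter here.
public sealed class WaveFormatSummary
{
    public int FormatTag;        // 1 means uncompressed PCM
    public int Channels;         // 1 = mono, 2 = stereo
    public int SamplesPerSecond;
    public int BitsPerSample;
}

public static class WaveHeader
{
    // Returns the contents of the "fmt " chunk, or null if the stream
    // does not look like a RIFF/WAVE file. The stream is left open.
    public static WaveFormatSummary Read(Stream stream)
    {
        var reader = new BinaryReader(stream, Encoding.ASCII);
        try
        {
            if (Tag(reader) != "RIFF") return null;
            reader.ReadInt32();                      // overall RIFF size, not needed
            if (Tag(reader) != "WAVE") return null;

            while (true)
            {
                string id = Tag(reader);
                if (id == null) return null;         // ran out of chunks
                int size = reader.ReadInt32();
                if (size < 0) return null;           // chunk too large for this sketch

                if (id == "fmt ")
                {
                    if (size < 16) return null;
                    var format = new WaveFormatSummary();
                    format.FormatTag = reader.ReadUInt16();
                    format.Channels = reader.ReadUInt16();
                    format.SamplesPerSecond = reader.ReadInt32();
                    reader.ReadInt32();              // average bytes per second
                    reader.ReadUInt16();             // block alignment
                    format.BitsPerSample = reader.ReadUInt16();
                    return format;
                }

                // Skip any other chunk; chunks are padded to an even length.
                reader.ReadBytes(size + (size & 1));
            }
        }
        catch (EndOfStreamException)
        {
            return null;                             // truncated header
        }
    }

    // Reads a four-character chunk identifier, or null at end of stream.
    private static string Tag(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(4);
        return bytes.Length == 4 ? Encoding.ASCII.GetString(bytes) : null;
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Pass the path of a .wav file.");
            return;
        }
        using (FileStream fs = File.OpenRead(args[0]))
        {
            WaveFormatSummary f = WaveHeader.Read(fs);
            if (f == null)
                Console.WriteLine("Not a RIFF/WAVE file.");
            else
                Console.WriteLine("PCM={0}, channels={1}, rate={2}, bits={3}",
                    f.FormatTag == 1, f.Channels, f.SamplesPerSecond, f.BitsPerSample);
        }
    }
}
```

If the reported values differ from a format known to work on your machine, converting the file with an audio tool first is probably simpler than trying to make the recognizer accept it.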

See the question "Help with SAPI v5.1 SpeechRecognitionEngine always gives same wrong result with C#" for more info about audio formats.

If your question is how to fetch a resource from a web server and handle it as a stream, see HttpWebResponse.GetResponseStream - http://msdn.microsoft.com/en-us/library/system.net.httpwebresponse.getresponsestream(v=vs.100).aspx
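
Putting the two pieces together, a rough sketch of fetching audio over HTTP and passing it to SetInputToWaveStream might look like the following. Several things here are assumptions rather than anything from the question: the URL is a placeholder, the server is assumed to return PCM WAV data (as far as I know System.Speech does not decode MP3, so an MP3 would have to be converted to WAV first), the whole response is buffered in memory for simplicity, and a DictationGrammar is used because no grammar was specified.

```csharp
using System;
using System.IO;
using System.Net;
using System.Speech.Recognition;   // requires a reference to System.Speech (Windows only)

class RemoteWaveRecognition
{
    static void Main()
    {
        // Placeholder URL (assumption); it must serve PCM WAV data, not MP3.
        const string url = "http://example.com/audio.wav";

        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream network = response.GetResponseStream())
        using (var buffer = new MemoryStream())
        {
            // Copy the download into memory so the recognizer gets a
            // complete, seekable stream rather than a live network stream.
            network.CopyTo(buffer);
            buffer.Position = 0;

            using (var recognizer = new SpeechRecognitionEngine())
            {
                // Free-form dictation; swap in your own grammar if you have one.
                recognizer.LoadGrammar(new DictationGrammar());
                recognizer.SetInputToWaveStream(buffer);

                // Recognize() returns null when it has nothing more to report,
                // for example at the end of the input or after a silence timeout.
                RecognitionResult result;
                while ((result = recognizer.Recognize()) != null)
                {
                    Console.WriteLine(result.Text);
                }
            }
        }
    }
}
```

For long recordings, RecognizeAsync with the SpeechRecognized event (as in the question's code) may be a better fit than this blocking loop, since the loop can stop early on a long pause.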

Michael Levy