20

I need an API or library (preferably free) that converts speech captured from a microphone into text (a string).

Additionally, I will need an API or library that can do text-to-speech.

I'd like to use C# and .NET, but other languages will suffice.

Thanks.

charles_har

6 Answers

16

You can use CMU Sphinx. It is an open-source, scalable solution, and I think it can be used on both the client and the server side:

http://cmusphinx.sourceforge.net/

If you are looking for a Microsoft desktop solution then you can use SAPI:

http://msdn.microsoft.com/en-us/magazine/cc163663.aspx

On the server side, you can use Microsoft Unified Communications, but do consider licensing as well:

http://www.microsoft.com/uc/en/gb/default.aspx

Update:

This thread also has some good references:

C# Speech Recognition - Is this what the user said?

ShahidAzim
  • Just updated answer with one more link. – ShahidAzim Jan 13 '11 at 07:37
  • You don't need UCS if you only need speech recognition on Windows Server. You can download the free Microsoft Speech Platform - http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. – Michael Levy Jan 14 '11 at 21:42
  • I didn't know about that, thanks for your post and it looks interesting. – ShahidAzim Jan 15 '11 at 16:51
12

Here is a complete example using C# and System.Speech for converting speech to text.

The code can be divided into 2 main parts:

1. Configuring the SpeechRecognitionEngine object (and its required elements).
2. Handling the SpeechRecognized and SpeechHypothesized events.

Step 1: Configuring the SpeechRecognitionEngine

_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);

At this point your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.

Step 2: Handling the SpeechRecognitionEngine Events

// Detach first so the handlers are never attached twice.
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // Real-time (interim) results from the engine
    string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // Final answer from the engine
    string finalAnswer = e.Result.Text;
}

That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use

_speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile);

instead of

_speechRecognitionEngine.SetInputToDefaultAudioDevice();
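As a rough illustration of the .wav variant, here is a minimal console sketch. It assumes a project reference to System.Speech, a recognizer installed on the machine, and a file path passed on the command line; the class name WaveFileTranscriber is made up for this example. It uses the synchronous Recognize() call in a loop rather than the event-based approach shown earlier, which tends to be simpler for batch transcription.

```csharp
using System;
using System.Speech.Recognition; // needs a reference to the System.Speech assembly (Windows)

// Hypothetical class name, used only for this sketch.
class WaveFileTranscriber
{
    static void Main(string[] args)
    {
        // Assumed usage: WaveFileTranscriber.exe <path-to-wav>
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: WaveFileTranscriber <path-to-wav>");
            return;
        }

        using (var engine = new SpeechRecognitionEngine())
        {
            // Free-form dictation, as in Step 1 of the answer.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(args[0]);

            // Recognize() blocks until one phrase is recognized and returns
            // null when no further phrase is recognized.
            RecognitionResult result;
            while ((result = engine.Recognize()) != null)
            {
                Console.WriteLine(result.Text);
            }
        }
    }
}
```

Dictation accuracy depends heavily on the audio quality and on the recognizer installed, so treat the output as a draft transcript.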

There are a bunch of different options in these classes and they are worth exploring in more detail.
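One of those options is a constrained grammar, which restricts the engine to a fixed set of phrases instead of free dictation. The sketch below is only an illustration: the class name CommandDemo and the three command words are invented for this example, and it assumes a reference to System.Speech plus a working default microphone.

```csharp
using System;
using System.Speech.Recognition; // needs a reference to the System.Speech assembly (Windows)

// Hypothetical class name, used only for this sketch.
class CommandDemo
{
    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // Illustrative command phrases; replace them with your own.
            var commands = new Choices("start", "stop", "pause");
            engine.LoadGrammar(new Grammar(new GrammarBuilder(commands)));
            engine.SetInputToDefaultAudioDevice();

            // Raised when the audio matches one of the phrases in the grammar.
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("{0} (confidence {1:F2})", e.Result.Text, e.Result.Confidence);

            // Raised when speech was heard but did not match the grammar.
            engine.SpeechRecognitionRejected += (sender, e) =>
                Console.WriteLine("Not recognized as a command.");

            // Keep recognizing until the user presses Enter.
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Say a command; press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```

A small grammar like this is usually more reliable than DictationGrammar when the application only needs to react to a handful of known phrases.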

http://ellismis.com/2012/03/17/converting-or-transcribing-audio-to-text-using-c-and-net-system-speech/

bulltorious
2

See Using c++ to call and use Windows Speech Recognition

Which says:

Microsoft provides speech recognition engines for both client and server versions of Windows. Both can be programmed with C++ or with .NET languages. The traditional API for programming in C++ is known as SAPI. The .NET framework namespaces for client and server speech are System.Speech and Microsoft.Speech.

SAPI documentation - http://msdn.microsoft.com/en-us/library/ms723627(VS.85).aspx

The .NET namespace for client recognition is System.Speech - http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx. Windows Vista and 7 include the speech engine.

The .NET namespace for server recognition is Microsoft.Speech and the complete SDK for the 10.2 version is available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. The speech engine is a free download.

Lots of earlier questions have addressed this. See "Prototype based on speech recognition", "getting started with speech recognition and speech synthesis", and "SAPI and Windows 7 Problem" for examples.

Michael Levy
-1

"I'd like to use C# and .NET, but other languages will suffice."

If you are open to C++, take a look at Festival.

Mahesh
-1

There is a built-in DLL in every Windows OS for text-to-speech. You will find it at c:\Programs\Shared Folders\Microsoft Shared\Speech\sapi.dll (SAPI stands for Speech API). I am not quite sure about the path, but in any case you can search for sapi.dll.

After adding a reference to it, you can use the following code snippet:

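using SpeechLib; // COM interop namespace for the Microsoft Speech Object Library (sapi.dll)

SpVoice oVoice = new SpVoice();
// Item(0) selects the first installed voice; other indexes select other voices
oVoice.Voice = oVoice.GetVoices("", "").Item(0);
oVoice.Volume = 50; // range is 0 to 100
oVoice.Speak("hello world", SpeechVoiceSpeakFlags.SVSFDefault);
oVoice = null;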
Pilgerstorfer Franz
-1

For text-to-speech conversion, follow these 3 steps:

1. Add a reference to System.Speech.

2. Add the using directives:

using System.Speech;

using System.Speech.Synthesis;

3. Add the following code, where textBox1 is the default name of a TextBox control:

            SpeechSynthesizer speaker = new SpeechSynthesizer();
            speaker.Rate = 1;
            speaker.Volume = 100;
            speaker.Speak(textBox1.Text);
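For readers without a form and a text box, the same steps can be tried from a console application. This is a minimal sketch, assuming a reference to System.Speech and at least one installed voice; the class name SpeakDemo, the spoken text, and the output file name hello.wav are placeholders chosen for this example.

```csharp
using System;
using System.Speech.Synthesis; // needs a reference to the System.Speech assembly (Windows)

// Hypothetical class name, used only for this sketch.
class SpeakDemo
{
    static void Main()
    {
        using (var speaker = new SpeechSynthesizer())
        {
            // List the voices installed on this machine.
            foreach (InstalledVoice voice in speaker.GetInstalledVoices())
            {
                Console.WriteLine(voice.VoiceInfo.Name);
            }

            // Speak through the default audio device.
            speaker.SetOutputToDefaultAudioDevice();
            speaker.Rate = 1;     // valid range is -10 to 10
            speaker.Volume = 100; // valid range is 0 to 100
            speaker.Speak("Hello world");

            // Optionally render the same text to a .wav file instead.
            speaker.SetOutputToWaveFile("hello.wav"); // placeholder file name
            speaker.Speak("Hello world");
            speaker.SetOutputToNull(); // releases the file
        }
    }
}
```

Speak blocks until playback finishes; SpeakAsync is the non-blocking counterpart if the UI must stay responsive.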
Rishi Jagati