Voice/Speech to text

Question

I need an API or library (preferably free) that will convert voice/speech through a microphone, into text (string).

Additionally, I will need an API or library that can do text-to-speech.

I'd like to use C# and .NET, but other languages will suffice.

Thanks.

score 16 · Answer 1 · edited May 23 '17 at 11:54


You can use CMU Sphinx; it is an open and scalable solution, and I think it can be used on both the client and the server side:

http://cmusphinx.sourceforge.net/

If you are looking for a Microsoft desktop solution then you can use SAPI:

http://msdn.microsoft.com/en-us/magazine/cc163663.aspx

On the server side, you can use Microsoft Unified Communications, but do consider licensing as well:

http://www.microsoft.com/uc/en/gb/default.aspx

Update:

This thread also has some good references:

C# Speech Recognition - Is this what the user said?


answered Jan 13 '11 at 07:28

ShahidAzim


Just updated answer with one more link. – ShahidAzim Jan 13 '11 at 07:37

You don't need UCS if you only need speech recognition on Windows Server. You can download the free Microsoft Speech Platform - http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. – Michael Levy Jan 14 '11 at 21:42
I didn't know about that, thanks for your post and it looks interesting. – ShahidAzim Jan 15 '11 at 16:51

score 12 · Answer 2 · answered Mar 17 '12 at 17:13

Here is a complete example using C# and System.Speech for converting speech to text.

The code can be divided into 2 main parts:

1. Configuring the SpeechRecognitionEngine object (and its required elements).

2. Handling the SpeechRecognized and SpeechHypothesized events.

Step 1: Configuring the SpeechRecognitionEngine

_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);

At this point your object is ready to start transcribing audio from the microphone. You need to handle some events though, in order to actually get access to the results.

Step 2: Handling the SpeechRecognitionEngine Events

_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);

private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // Real-time (interim) results from the engine.
    string realTimeResults = e.Result.Text;
}

private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // Final answer from the engine.
    string finalAnswer = e.Result.Text;
}
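
Put together, the two steps fit in a small console program. This is only a minimal sketch, not the original author's code: the class name DictationDemo, the lambda handlers and the Console output are assumptions made for illustration. It assumes Windows, a working microphone and a project reference to System.Speech.

```csharp
using System;
using System.Speech.Recognition;

class DictationDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // Step 1: configure the engine for free-form dictation from the microphone.
            engine.SetInputToDefaultAudioDevice();
            engine.LoadGrammar(new DictationGrammar());

            // Step 2: attach the handlers before recognition starts.
            engine.SpeechHypothesized += (sender, e) =>
                Console.WriteLine("partial: " + e.Result.Text);
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("final:   " + e.Result.Text);

            // Keep recognizing phrase after phrase until told to stop.
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Speak into the microphone; press Enter to stop.");
            Console.ReadLine();

            engine.RecognizeAsyncStop();
        }
    }
}
```

Attaching the handlers before calling RecognizeAsync means no early result can arrive without a listener.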

That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use

_speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile);

instead of

_speechRecognitionEngine.SetInputToDefaultAudioDevice();
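
For a file you usually don't need the events at all; a synchronous loop is enough. The sketch below is an assumption about how one might do it, not part of the original example: the class name WavTranscriber and taking the path from the command line are made up for illustration. Recognize() returns null when it has nothing more to report, which ends the loop at the end of the file. A long silence inside the recording can also produce null, so the loop may stop early on such files.

```csharp
using System;
using System.Speech.Recognition;

class WavTranscriber   // hypothetical name, for illustration only
{
    static void Main(string[] args)
    {
        // Assumption: the .wav path is passed as the first command-line argument.
        string pathToTargetWavFile = args[0];

        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(pathToTargetWavFile);

            // Recognize() blocks until one phrase is recognized or input runs out.
            RecognitionResult result;
            while ((result = engine.Recognize()) != null)
            {
                Console.WriteLine(result.Text);
            }
        }
    }
}
```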

There are a bunch of different options in these classes and they are worth exploring in more detail.

http://ellismis.com/2012/03/17/converting-or-transcribing-audio-to-text-using-c-and-net-system-speech/

can we use SpeechRecognitionEngine for android applications using MONO Framework??? — Awais Tariq, Jun 20 '12 at 07:47
@bulltorious Incorrect. Mono is a Linux emulator for .NET, so yes, this will work with Mono. *Also: Visual Studio is an IDE, not a programming language.* — AStopher, Apr 03 '17 at 16:25

score 2 · Answer 3 · edited May 23 '17 at 12:17

See Using c++ to call and use Windows Speech Recognition

Which says:

Microsoft provides speech recognition engines for both client and server versions of Windows. Both can be programmed with C++ or with .NET languages. The traditional API for programming in C++ is known as SAPI. The .NET framework namespaces for client and server speech are System.Speech and Microsoft.Speech.

SAPI documentation - http://msdn.microsoft.com/en-us/library/ms723627(VS.85).aspx

The .NET namespace for client recognition is System.Speech - http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx. Windows Vista and 7 include the speech engine.

The .NET namespace for server recognition is Microsoft.Speech and the complete SDK for the 10.2 version is available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. The speech engine is a free download.

Lots of earlier questions have addressed this. See "Prototype based on speech recognition", "getting started with speech recognition and speech synthesis", and "SAPI and Windows 7 Problem" for examples.

score -1 · Answer 4 · answered Jan 13 '11 at 06:48


"I'd like to use C# and .NET, but other languages will suffice."

If you are open to C++, check out Festival.


Mahesh


score -1 · Answer 5 · answered Jan 13 '11 at 07:01


There is a built-in DLL in every Windows OS for text-to-speech. You will find it at c:\Programs\Shared Folders\Microsoft Shared\Speech\sapi.dll (SAPI: Speech API). I am not quite sure about the path, but in any case you can search for sapi.dll.

After adding a reference to it, you can use the following code snippet:

SpVoice oVoice = new SpVoice();
oVoice.Voice = oVoice.GetVoices("","").Item(0); // index 0 selects the first installed voice
oVoice.Volume = 50;
oVoice.Speak("hello world", SpeechVoiceSpeakFlags.SVSFDefault);
oVoice = null;


Pilgerstorfer Franz


This is the path for Win7 C:\Windows\System32\Speech\Common – Serkan Hekimoglu Jan 13 '11 at 07:36

Speech to Text... not the other way around. – Rob Hay Sep 12 '11 at 07:32

@RobHay "Additionally, I will need an API or library that can do text-to-speech" so I think my answer is - at least - partially correct. – Pilgerstorfer Franz Jul 28 '12 at 12:12

score -1 · Answer 6 · answered Feb 06 '12 at 09:23


For text-to-speech conversion, follow these 3 steps:

1. Add a reference to System.Speech.

2. Add the using directives:

using System.Speech;

using System.Speech.Synthesis;

3. Add the following code, where textBox1 is the default name of a TextBox.

            SpeechSynthesizer speaker = new SpeechSynthesizer();
            speaker.Rate = 1;
            speaker.Volume = 100;
            speaker.Speak(textBox1.Text);
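
If you don't have a form with a text box, the same calls work from a console program. This is a rough sketch under assumptions: the class name SpeakDemo and the hard-coded "Hello world" string are made up for illustration, and it needs Windows with at least one installed voice plus a reference to System.Speech.

```csharp
using System;
using System.Speech.Synthesis;

class SpeakDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        using (var speaker = new SpeechSynthesizer())
        {
            // Send the audio to the default playback device (speakers or headphones).
            speaker.SetOutputToDefaultAudioDevice();

            speaker.Rate = 1;      // valid range is -10 (slowest) to 10 (fastest)
            speaker.Volume = 100;  // valid range is 0 to 100

            // Speak() blocks until the whole string has been spoken.
            speaker.Speak("Hello world");
        }
    }
}
```

Speak() is synchronous; SpeakAsync() is the non-blocking variant if the UI must stay responsive.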


Rishi Jagati


Except it is going the opposite direction from the question; instead of speech-to-text, this shows text-to-speech. – B. Clay Shannon-B. Crow Raven Apr 12 '16 at 15:36
