I'm using the .NET speech recognition library (System.Speech) in a WPF desktop app as an alternative to menu commands. (I want to focus on the tablet experience, where you don't have a keyboard.) It works, sort of, except that the recognition accuracy is so bad it's unusable. So I tried dictating into Word, and Word worked reasonably well. I'm using my built-in laptop microphone in both cases, and both programs can hear the same speech simultaneously (provided Word retains keyboard focus), yet Word gets it right and my WPF app does an abysmal job.
I've tried both a generic DictationGrammar() and a tiny specialised grammar, and I've tried both "en-US" and "en-AU"; in all cases Word performs well and my app performs poorly. Even comparing the specialised grammar in my app to general dictation in Word, the app gets it wrong 50% of the time, e.g. hearing "size small" as "color small".
private void InitSpeechRecognition()
{
    recognizer = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));

    // Create and load a grammar. (The if (false) is my toggle between
    // the specialised command grammar and free dictation.)
    if (false)
    {
        GrammarBuilder grammarBuilder = new GrammarBuilder();
        Choices commandChoices = new Choices("weight", "color", "size");
        grammarBuilder.Append(commandChoices);
        Choices valueChoices = new Choices();
        valueChoices.Add("normal", "bold");
        valueChoices.Add("red", "green", "blue");
        valueChoices.Add("small", "medium", "large");
        grammarBuilder.Append(valueChoices);
        recognizer.LoadGrammar(new Grammar(grammarBuilder));
    }
    else
    {
        recognizer.LoadGrammar(new DictationGrammar());
    }

    // Add a handler for the speech recognized event.
    recognizer.SpeechRecognized +=
        new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);

    // Configure input to the speech recognizer.
    recognizer.SetInputToDefaultAudioDevice();

    // Start asynchronous, continuous speech recognition.
    recognizer.RecognizeAsync(RecognizeMode.Multiple);
}
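(I realise the specialised grammar above lets any value follow any command, so "color small" is technically a legal phrase in it. A variant that pairs each command with only its own values would look something like the sketch below, which would slot into InitSpeechRecognition in place of the existing builder code, but I'd still expect it to mishear "size" as "color" less often than 50% of the time.)

```csharp
// Each command accepts only its own values, so e.g. "color small"
// is no longer a phrase the grammar can produce.
var weight = new GrammarBuilder("weight");
weight.Append(new Choices("normal", "bold"));

var color = new GrammarBuilder("color");
color.Append(new Choices("red", "green", "blue"));

var size = new GrammarBuilder("size");
size.Append(new Choices("small", "medium", "large"));

var commands = new Choices(new GrammarBuilder[] { weight, color, size });
recognizer.LoadGrammar(new Grammar(commands));
```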
Sample results from Word:
Hello
make it darker
I want a brighter colour
make it reader
make it greener
thank you
make it bluer
make it more blue
make it darker
turn on debugging
turn off debugging
zoom in
zoom out
The same audio in WPF, dictation grammar:
a lower
make it back
when Ted Brach
making reader
and he
liked the
ethanol and
act out
to be putting
it off the parking
zoom in
and out
I got the assembly using NuGet. I'm using Runtime version=v4.0.30319 and version=4.0.0.0. If I'm supposed to "train" it, the documentation doesn't explain how to do this, and I don't know whether the training is shared with other programs such as Word, or where the training is saved. I've been playing around with it for long enough now that it should know the sound of my voice.
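(One thing I haven't tried, in case it's relevant: as I understand it, the SpeechRecognizer class uses the shared Windows Speech Recognition host rather than an in-process engine, which is presumably what Control Panel training applies to. A minimal sketch of that approach, assuming the shared recogniser is available on the machine, would be:)

```csharp
using System;
using System.Speech.Recognition;

// Shared recogniser: hosted by Windows Speech Recognition rather than
// running in-process, so it uses the user's trained speech profile.
// No SetInputToDefaultAudioDevice() call is needed; the shared
// recogniser manages its own audio input.
var shared = new SpeechRecognizer();
shared.LoadGrammar(new DictationGrammar());
shared.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);
shared.Enabled = true;
```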
Can anyone tell me what I'm doing wrong?