I have a Windows 10 UWP app in which I am enabling voice recognition for a text box. Yes, I know that I can also leverage Cortana for this. However, Cortana comes with some cons as well, mainly that you have little to no control over Cortana from within the app.

This is where continuous recognition with the SpeechRecognizer class (in the Windows.Media.SpeechRecognition namespace) comes in. I like the amount of control it gives me. However, it seems to stop listening at random after a few seconds.

Here is how I have it implemented. Note that I also tried setting every possible timeout to 0, which should mean no timeout.

Fields on the page:

private SpeechRecognizer speechRecognizer;
private CoreDispatcher dispatcher;

OnLoaded for the page:

speechRecognizer = new SpeechRecognizer();
speechRecognizer.Timeouts.BabbleTimeout = TimeSpan.FromSeconds(0);
speechRecognizer.Timeouts.InitialSilenceTimeout = TimeSpan.FromSeconds(0);
speechRecognizer.Timeouts.EndSilenceTimeout = TimeSpan.FromSeconds(0);
speechRecognizer.ContinuousRecognitionSession.AutoStopSilenceTimeout = TimeSpan.FromSeconds(0);

SpeechRecognitionCompilationResult result = await speechRecognizer.CompileConstraintsAsync();
speechRecognizer.ContinuousRecognitionSession.ResultGenerated += ContinuousRecognitionSession_ResultGenerated;
speechRecognizer.StateChanged += SpeechRecognizer_StateChanged;

Then, when I click a button to start listening, I do this to start:

if (speechRecognizer.State == SpeechRecognizerState.Idle)
{
     await speechRecognizer.ContinuousRecognitionSession.StartAsync();
}

Finally, I handle the two events above, ResultGenerated and StateChanged, and I have breakpoints set in both handlers. When the page loads, everything is instantiated just fine. When I click the button to start listening, it starts just fine as well, and I see the StateChanged handler fire to show it is listening. However, if I let the app sit idle (no speaking) for a few seconds (the number of seconds seems random, anywhere between 2 and 5), the StateChanged event fires and reports that the recognizer is idle again. After that, I cannot get the ResultGenerated event to fire when I speak, which further shows it is no longer listening.

I can click the button to start listening again and it will, but with the same random stopping again.

Also, if I do speak right away, after I click the button, the speech recognition does work just fine.

What I want is that when I click the button, it listens indefinitely until I call StopAsync. Does anybody know why it just stops on its own?

UPDATE: I added an event handler for Completed:

speechRecognizer.ContinuousRecognitionSession.Completed += ContinuousRecognitionSession_Completed;

I did this because the handler gives me a status in args.Status, and I put a breakpoint there. The funny thing is, this breakpoint is hit in the 2-5 seconds when continuous recognition stops, and it gives a status of "SUCCESS" even though I didn't say anything and the ResultGenerated event never fired with a result. So how is it reporting success with no result? And why does this cause it to stop?
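
To see why a session ended, it can help to log every status the Completed event reports. The following is a minimal sketch, not the actual handler from this app: the `MainPage` class name and the log messages are assumptions, while the event args type and the `SpeechRecognitionResultStatus` values come from the Windows.Media.SpeechRecognition API. Note that `Status` describes how the session ended, so `Success` appears to mean only that it ended without an error, not that a phrase was recognized.

```csharp
using System.Diagnostics;
using Windows.Media.SpeechRecognition;

// "MainPage" is an assumed page name; use the page that owns the recognizer.
public sealed partial class MainPage
{
    // Raised when the continuous session ends, for any reason.
    private void ContinuousRecognitionSession_Completed(
        SpeechContinuousRecognitionSession sender,
        SpeechContinuousRecognitionCompletedEventArgs args)
    {
        switch (args.Status)
        {
            case SpeechRecognitionResultStatus.Success:
                // The session ended without an error being reported.
                Debug.WriteLine("Session completed without an error.");
                break;
            case SpeechRecognitionResultStatus.TimeoutExceeded:
                Debug.WriteLine("Session ended because a timeout was exceeded.");
                break;
            case SpeechRecognitionResultStatus.NetworkFailure:
                Debug.WriteLine("Session ended because of a network failure.");
                break;
            case SpeechRecognitionResultStatus.UserCanceled:
                Debug.WriteLine("Session was canceled.");
                break;
            default:
                // Any other status, logged by name.
                Debug.WriteLine($"Session ended with status: {args.Status}");
                break;
        }
    }
}
```

Using `Debug.WriteLine` instead of a breakpoint keeps focus on the app window while the session is running.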

Thanks!

Michael Bedford
  • My crystal ball says that your ResultGenerated event handler throws an exception. There is no decent mechanism to forward such a mishap to your UI, use try/catch to make sure. – Hans Passant May 10 '18 at 17:00
  • @HansPassant Thank you for the thought. I added a try/catch in the ResultGenerated handler, but there is no exception. In fact, the event is never even raised because I am not speaking. I did find something else odd and added it as an update above: I added the Completed event and found that it is raised with "SUCCESS" even though I didn't say anything and don't get a result. This is when it stops listening. – Michael Bedford May 10 '18 at 18:49
  • I just found more details. It is network dependent, so if there is no network, it will fail to start, and I believe that if the network drops, it will stop with a Status of "NetworkFailure". This might explain the intermittent stopping. Secondly, I found that on a LAN connection it is not so intermittent. It still stops, but it now consistently stops after 15 seconds, with the network no longer in question. So I understand things better, except: why does it stop after 15 seconds if I set all possible timeouts to infinite? – Michael Bedford May 10 '18 at 18:57
  • Cannot reproduce your issue. On my side, recognition does not stop after 15 seconds, or even longer. And by the way, as @HansPassant mentioned, your `ResultGenerated` event handler actually did throw an exception, since `dispatcher` is never instantiated. Or perhaps you didn't provide your whole code snippet. Please provide a [mcve] so I can test on my side. – Sunteen Wu May 11 '18 at 09:07
  • @Michael Bedford Hello, I have a similar problem, described here: [UWP speech recognition failure requires restart with foreground and timeout](https://stackoverflow.com/questions/57531162/uwp-speech-recognition-failure-requires-restart-with-foreground-and-timeout). Did you use the suggestion from the answer given to your question? If so, it would be very helpful to know how you solved this. Could you point to a guide or example that helped you figure out this problem? – lf80 Aug 17 '19 at 03:13

2 Answers

I had the same problem and came across this question. I think I eventually figured it out. The problem is that when the UWP app leaves the foreground (for example, when you switch to another app), the speech recognizer stops without raising any event.

When debugging, this will of course happen whenever a breakpoint is hit, because focus moves to the debugger. I think the problem is fixed by restarting the SpeechRecognizer when the app returns to the foreground.
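
A minimal sketch of that restart, assuming the recognizer has already been created and compiled as in the question. The `MainPage` class name, the `shouldBeListening` flag, and the `HookWindowActivation` helper are hypothetical names introduced for illustration; `Window.Current.Activated`, `CoreWindowActivationState`, and `SpeechRecognizerState` are existing UWP APIs. Whether window activation is the right signal for every app is also an assumption.

```csharp
using System;
using System.Diagnostics;
using Windows.Media.SpeechRecognition;
using Windows.UI.Core;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

// "MainPage" is an assumed page name.
public sealed partial class MainPage : Page
{
    // Created and compiled elsewhere, as in the question.
    private SpeechRecognizer speechRecognizer;

    // Hypothetical flag: set to true when the user clicks "start",
    // and to false when the user explicitly stops listening.
    private bool shouldBeListening;

    // Call once, for example from the page's Loaded handler.
    private void HookWindowActivation()
    {
        Window.Current.Activated += OnWindowActivated;
    }

    private async void OnWindowActivated(object sender, WindowActivatedEventArgs e)
    {
        // Ignore the notification that the window lost focus.
        if (e.WindowActivationState == CoreWindowActivationState.Deactivated)
        {
            return;
        }

        // Restart only if the user wants recognition and the recognizer went idle.
        if (!shouldBeListening || speechRecognizer == null ||
            speechRecognizer.State != SpeechRecognizerState.Idle)
        {
            return;
        }

        try
        {
            await speechRecognizer.ContinuousRecognitionSession.StartAsync();
        }
        catch (Exception ex)
        {
            // Exceptions in async void handlers are easy to miss, so log them.
            Debug.WriteLine($"Restarting recognition failed: {ex.Message}");
        }
    }
}
```

The `State` check mirrors the one the question uses before calling StartAsync, so a session that is still running is left alone.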

Homde
  • Hello, I have the same problem here [UWP speech recognition failure requires restart with foreground and timeout](https://stackoverflow.com/questions/57531162/uwp-speech-recognition-failure-requires-restart-with-foreground-and-timeout) and [Send speech recognition args.Result as parameter in UWP desktop-bridge package](https://stackoverflow.com/questions/56961757/send-speech-recognition-args-result-as-parameter-in-uwp-desktop-bridge-package). Can you show how to restart the `ContSpeechRecognizer` when the app returns to the foreground, please? – lf80 Aug 17 '19 at 03:20

I have an answer for your problem. After almost 2 weeks of struggle, I finally found the root of the problem. Unfortunately, neither continuous nor normal speech recognition offers many ways of detecting its operational state. My advice is to use conditional recursion for continuous speech recognition, that is, to restart the session from its own Completed handler. Here is an example:

Code:

// Initial setup, for example in the page's Loaded handler.
speechRecognizer = new Windows.Media.SpeechRecognition.SpeechRecognizer();
await speechRecognizer.CompileConstraintsAsync();
speechRecognizer.ContinuousRecognitionSession.AutoStopSilenceTimeout = TimeSpan.FromMilliseconds(0);
speechRecognizer.ContinuousRecognitionSession.ResultGenerated += ContinuousRecognitionSession_ResultGenerated;
speechRecognizer.ContinuousRecognitionSession.Completed += ContinuousRecognitionSession_Completed;
speechRecognizer.StateChanged += SpeechRecognizer_StateChanged;
await speechRecognizer.ContinuousRecognitionSession.StartAsync();

// This goes inside the ContinuousRecognitionSession_Completed handler,
// which must be declared async because it awaits.
try
{
    // Throw away the finished recognizer and build a fresh one.
    speechRecognizer.Dispose();

    speechRecognizer = new Windows.Media.SpeechRecognition.SpeechRecognizer();
    await speechRecognizer.CompileConstraintsAsync();
    speechRecognizer.ContinuousRecognitionSession.AutoStopSilenceTimeout = TimeSpan.FromMilliseconds(0);
    speechRecognizer.ContinuousRecognitionSession.ResultGenerated += ContinuousRecognitionSession_ResultGenerated;
    speechRecognizer.ContinuousRecognitionSession.Completed += ContinuousRecognitionSession_Completed;
    speechRecognizer.StateChanged += SpeechRecognizer_StateChanged;
    await speechRecognizer.ContinuousRecognitionSession.StartAsync();
}
catch (Exception ex)
{
    // Log instead of silently swallowing the failure.
    System.Diagnostics.Debug.WriteLine(ex);
}

What this means is that you re-initialize the continuous speech recognizer each time it finishes a recognition session, which the API reports by raising the Completed event.
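
The "conditional" part can be made explicit with a flag, so that the session is not restarted after a deliberate stop. This is a minimal sketch under stated assumptions: the `MainPage` class name, the `keepListening` flag, and the three helper methods are hypothetical names, and the ResultGenerated and StateChanged subscriptions are omitted for brevity. It follows the dispose-and-recreate approach above rather than any officially documented pattern.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Windows.Media.SpeechRecognition;

// "MainPage" is an assumed page name.
public sealed partial class MainPage
{
    private SpeechRecognizer speechRecognizer;

    // Hypothetical flag; volatile because Completed may be raised off the UI thread.
    private volatile bool keepListening;

    // Call from the "start" button handler.
    private async Task StartListeningAsync()
    {
        keepListening = true;
        await CreateAndStartRecognizerAsync();
    }

    // Call from the "stop" button handler.
    private async Task StopListeningAsync()
    {
        keepListening = false; // Prevents the Completed handler from restarting.
        if (speechRecognizer != null &&
            speechRecognizer.State != SpeechRecognizerState.Idle)
        {
            await speechRecognizer.ContinuousRecognitionSession.StopAsync();
        }
    }

    // Builds a fresh recognizer and starts a continuous session on it.
    private async Task CreateAndStartRecognizerAsync()
    {
        speechRecognizer?.Dispose();
        speechRecognizer = new SpeechRecognizer();
        await speechRecognizer.CompileConstraintsAsync();
        speechRecognizer.ContinuousRecognitionSession.AutoStopSilenceTimeout = TimeSpan.Zero;
        speechRecognizer.ContinuousRecognitionSession.Completed += ContinuousRecognitionSession_Completed;
        await speechRecognizer.ContinuousRecognitionSession.StartAsync();
    }

    private async void ContinuousRecognitionSession_Completed(
        SpeechContinuousRecognitionSession sender,
        SpeechContinuousRecognitionCompletedEventArgs args)
    {
        // Restart only while the user still wants recognition.
        if (!keepListening)
        {
            return;
        }

        try
        {
            await CreateAndStartRecognizerAsync();
        }
        catch (Exception ex)
        {
            Debug.WriteLine($"Restart after {args.Status} failed: {ex.Message}");
        }
    }
}
```

Without the flag, calling StopAsync would also raise Completed and immediately start a new session.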

Another thing to keep in mind is that Windows.Media.SpeechRecognition.SpeechRecognizer pauses the recognition session each time the core window loses focus. This can't be remedied with the Activate() method; it can be remedied only with the IsEnabled property. My advice is to use a timer that sets the IsEnabled property to true at a specified interval.

Another major piece of advice is to use the Windows.Media.SpeechRecognition API from a WPF app instead, because that gives you a much larger degree of control over the app's functionality. To make the API available there, install the NuGet package Microsoft.Windows.SDK.Contracts.

teodor mihail