Comparison of Speech Recognition use in Android: by Intent or on-thread?

Question

Introduction

Android provides two ways for me to use speech recognition.

The first way is by an Intent, as in this question: Intent example. A new Activity is pushed onto the top of the stack; it listens to the user, attempts to transcribe the speech (normally via the cloud), then returns the result to my app via an onActivityResult call.

The second is by getting a SpeechRecognizer, like the code here: SpeechRecognizer example. Here, it looks like the speech is recorded and transcribed on some other thread, then callbacks bring me the results. And this is done without leaving my Activity.

I would like to understand the pros and cons of these two ways of doing speech recognition.

What I've got so far

Using the Intent:

is simple to code
avoids reinventing the wheel
gives consistent user experience of speech recognition across the device

but

might be slow, because a new Activity with its own window has to be created

Using the SpeechRecognizer:

lets me retain control of UI in my app
gives me extra possibilities of things to respond to (documentation)

but

can only be called from the main thread
gives more control, which in turn requires more error-checking

That's odd. Why have people downvoted this question? And why not give me some feedback? — hcarver, Aug 14 '12 at 17:28
And they seem to have downvoted both answers too. Where's the love? — hcarver, Aug 16 '12 at 12:38
I think both approaches have the problem of being slow to start. There is a delay between when the app wants speech recognition to start and when it actually does. With the Intent approach, the user is at least made aware of this by various dialogs. With SpeechRecognizer, you'd have to tell the user some other way. — gregm, Aug 17 '12 at 09:28
No, there is no alternative to the slow start. What's worse is that the start time is variable: sometimes it is fast, other times slow. You have to do something with the UI to help, or maybe Google will solve it by adding some helpful beeps. — gregm, Aug 20 '12 at 11:20
Hi, can you share some example code for this? How you solve this ? — Hardik Joshi, May 02 '14 at 07:00

gregm · Accepted Answer · 2012-08-17T09:26:46.050

In addition to all this, I'd add at least this point:

SpeechRecognizer is better for hands-free user interfaces, since your app actually gets to respond to error conditions like "No matches" and perhaps restart itself. When you use the Intent, the app beeps and shows a dialog that the user must press to continue.
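As a rough illustration of the "respond to errors and restart" idea, here is a minimal sketch of a restart policy in plain Java. The three numeric codes mirror the ERROR_* constants in android.speech.SpeechRecognizer; the RestartPolicy class, its method names, and the retry limit are hypothetical and only show one way the decision could be structured.

```java
/**
 * Hypothetical helper that decides whether a hands-free app should restart
 * listening after a recognition error. Not part of the Android SDK.
 */
final class RestartPolicy {
    // Values mirror the constants of the same name in
    // android.speech.SpeechRecognizer.
    static final int ERROR_SPEECH_TIMEOUT = 6;
    static final int ERROR_NO_MATCH = 7;
    static final int ERROR_INSUFFICIENT_PERMISSIONS = 9;

    private final int maxRetries; // assumption: give up after this many restarts in a row
    private int retries = 0;

    RestartPolicy(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Call with the code passed to onError; true means "start listening again". */
    boolean shouldRestart(int error) {
        boolean recoverable =
                error == ERROR_NO_MATCH || error == ERROR_SPEECH_TIMEOUT;
        if (!recoverable || retries >= maxRetries) {
            return false;
        }
        retries++;
        return true;
    }

    /** Call after a successful result so the retry budget starts over. */
    void reset() {
        retries = 0;
    }
}
```

In an app, shouldRestart would presumably be called from RecognitionListener.onError and, when it returns true, followed by another startListening call; reset would be called from onResults.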

My summary is as follows:

SpeechRecognizer

Show different UI or no UI at all. Do you really want your app's UI to beep? Do you really want your UI to show a dialog when there is an error and wait for user to click?
App can do something else while speech recognition is happening
Can recognize speech while running in the background or from a service
Can handle errors better
Can access low-level speech data such as the raw audio or the RMS level. You can analyze that audio, or use the loudness to drive some kind of flashing light that indicates the app is listening
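To make the "flashing light" idea concrete, here is a minimal sketch in plain Java that turns the value reported by RecognitionListener.onRmsChanged(float rmsdB) into a 0..1 level for an indicator. The RmsMeter class is hypothetical, and the dB range passed to it is an assumption: the range of rmsdB is not something the app controls, so the bounds would need calibrating on a real device.

```java
/**
 * Hypothetical helper that maps an RMS reading in dB to a level in [0, 1],
 * e.g. to set the alpha or size of a "listening" indicator.
 */
final class RmsMeter {
    private final float minDb; // reading treated as silence (assumed bound)
    private final float maxDb; // reading treated as full loudness (assumed bound)

    RmsMeter(float minDb, float maxDb) {
        if (maxDb <= minDb) {
            throw new IllegalArgumentException("maxDb must be greater than minDb");
        }
        this.minDb = minDb;
        this.maxDb = maxDb;
    }

    /** Linearly rescales rmsdB to [0, 1], clamping readings outside the range. */
    float level(float rmsdB) {
        float t = (rmsdB - minDb) / (maxDb - minDb);
        return Math.max(0f, Math.min(1f, t));
    }
}
```

For example, with new RmsMeter(-2f, 10f), a reading of 4 dB sits halfway through the range and maps to 0.5, while anything at or below -2 dB maps to 0.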

Intent

Consistent, and easy to use UI for users
Easy to program

Is one more accurate than the other? Or is the Intent version just a UI for SpeechRecognizer? — user13267, Jan 06 '18 at 06:16

score 2 · Answer 2 · answered Aug 13 '12 at 10:47

The main difference is UI. SpeechRecognizer doesn't have any so you are responsible for creating one.
I once wrote a prototype with a receiver that listened for the headset button and then activated speech recognition to listen for some commands. The screen was not activated, so I had to use SpeechRecognizer (my UI was some prerecorded sounds and text-to-speech).

The second difference is that SpeechRecognizer is able to listen continuously. The Intent version will always end execution after some period. For example, SpeechRecognizer is used by the speech recognition "keyboard" so you can dictate an SMS.
In that case you will receive partial results only (in normal mode SpeechRecognizer gives only final results).

Marek R

Incorrect. Both SpeechRecognizer and Intent end after a certain controllable period. With SpeechRecognizer at least you can restart it though. – gregm Aug 17 '12 at 09:21
You are both incorrect. ;) It all depends on the app that provides the speech recognition service. It can run as long as it wants. In case of `SpeechRecognizer` the caller can call `cancel` on it but not control the recording time. In case of the Intent the control is handed over to a new Activity which might finish only if the user presses BACK. – Kaarel Aug 23 '12 at 07:15

score 1 · Answer 3 · answered Aug 23 '12 at 07:30

One thing that the other answers have not mentioned: if multiple speech recognizers are installed on the device, then how the user switches between them depends on whether the "Intent" or the SpeechRecognizer is used.

In the case of the "Intent", the standard Activity selection dialog pops up. The user can choose the recognizer to be used, and optionally set it globally as the default recognizer to avoid the dialog in the future.
In the case of SpeechRecognizer, the user can set and configure the default recognizer in the global settings (Language and input -> Voice recognizer on ICS).

So, depending on which interface is used the documentation about setting the default recognizer and switching between recognizers should be different. (In most cases though there is just one recognizer, Google Voice Search, so this might not be a big issue in practice.)
