InApp voice-triggered controlling and offline SpeechRecognition on Android ICS

Question

I am currently developing a cross-platform app that should run on Google Glass (Android 4.0.4), a smartphone (Android 4.0.4 or newer), and other wearables. The minimum version is ICS (Ice Cream Sandwich).
The app presents different event-driven views, triggered either by the user or by the system (network events).
For user control, I want to implement speech recognition that only needs to recognize numbers, or at least single digits, plus the commands "forward" and "backward". It is important that it also works offline. It should run in the background while the application is running and should not cover the user interface.
Related work:
SpeechRecognizer seems to offer offline functionality only from Jelly Bean onward (I haven't found a way to use it on Android 4.0.4).
Implementing a custom IME and using voice typing (as Utter! does; really nice work, by the way) seems very expensive and hacky to me.
My first attempts to use PocketSphinx haven't been successful yet.

For offline speech recognition on Ice Cream Sandwich, you can try the recently updated PocketSphinx demo for Android: http://cmusphinx.sourceforge.net/wiki/tutorialandroid — Nikolay Shmyrev, Jan 14 '14 at 15:52
I have already tried this demo. The app installs, but the commands are not recognized, and I have not yet figured out why. I see no errors in logcat; everything seems to load. — AlexejWagner.java, Jan 15 '14 at 07:32
The demo creates raw files in /mnt/sdcard/data/Android/edu.cmu.pocketsphinx; please share them. — Nikolay Shmyrev, Jan 15 '14 at 11:17
@NikolayShmyrev I'm sure you mean the /mnt/sdcard/Android/data/edu.cmu.pocketsphinx.demo/files dir... [raw files](https://www.dropbox.com/sh/d5yu2tfrcousrt0/uCA3na2C7m) thx — AlexejWagner.java, Jan 15 '14 at 12:03
Can you share the logcat output too? Also, are you using the version from yesterday? It was broken before but was updated just yesterday. — Nikolay Shmyrev, Jan 15 '14 at 14:10
@NikolayShmyrev Yeah! I hadn't seen that there was an update. I have just updated the project and it seems to work. But I still see one problem: some background noise is recognized as words from the dictionary, as if it simply picks something. I will try to tune the sensitivity, but I think I still need something else. Thank you for the suggestions! — AlexejWagner.java, Jan 15 '14 at 14:50

score 2 · Accepted Answer · answered Jan 16 '14 at 09:20

The offline voice capabilities of Jelly Bean are handled by the Google Search application internally. There has been no change to either the RecognizerIntent or the SpeechRecognizer API.

This isn't ideal for what you want to achieve, as depending on a closed-source application that isn't cross-platform will throw a spanner in the works. Regardless of that, a simple offline = true parameter is nowhere to be seen, and you'd end up having to coerce this behaviour. I have requested this parameter, by the way!

Google handle their wake-up phrase with a dedicated processor core, but it looks unlikely that the manufacturers intend to expose this functionality to anyone other than OEMs.

That leaves alternative recognition providers with RESTful services, such as iSpeech, AT&T and Nuance, but again, you'll be murdering the battery and using significant data if you take this approach, not to mention the audio conflicts that occur on the Android platform.

Finally, you end up with Sphinx. At present, I consider it the only viable solution to lower the resource usage, but it doesn't get around the audio conflict issues. I've been working on getting it running within my application for a long time, but I still have major issues with false positives that have stopped me including it in production.
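
Since the question only needs digits plus "forward" and "backward", the vocabulary can be kept very small. PocketSphinx accepts JSGF grammars, so here is a rough, plain-Java sketch of what such a grammar and a matching hypothesis check might look like. The class name `VoiceCommands`, the grammar name `commands`, and the result strings are made up for illustration and are not part of any PocketSphinx API. This does not solve the false-positive problem by itself, since a grammar search can still force noise onto one of the allowed words.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: builds a tiny JSGF grammar and checks hypotheses against it. */
final class VoiceCommands {
    // Spoken digit words; the list index is the numeric value.
    private static final List<String> DIGITS = Arrays.asList(
            "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine");
    private static final List<String> NAVIGATION = Arrays.asList("forward", "backward");

    /** Returns JSGF text that accepts exactly one navigation word or one digit. */
    static String buildGrammar() {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");
        sb.append("grammar commands;\n");
        sb.append("public <command> = <navigation> | <digit>;\n");
        sb.append("<navigation> = ").append(join(NAVIGATION)).append(";\n");
        sb.append("<digit> = ").append(join(DIGITS)).append(";\n");
        return sb.toString();
    }

    /** Maps a hypothesis to "FORWARD", "BACKWARD" or "DIGIT:n"; null if out of vocabulary. */
    static String parse(String hypothesis) {
        if (hypothesis == null) return null;
        String word = hypothesis.trim().toLowerCase(Locale.ROOT);
        if (word.equals("forward")) return "FORWARD";
        if (word.equals("backward")) return "BACKWARD";
        int digit = DIGITS.indexOf(word);
        return digit >= 0 ? "DIGIT:" + digit : null;
    }

    // Joins alternatives with the JSGF "|" operator (manual loop keeps it usable on old Java levels).
    private static String join(List<String> words) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words.size(); i++) {
            if (i > 0) sb.append(" | ");
            sb.append(words.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(buildGrammar());
        System.out.println(parse("forward"));
        System.out.println(parse("seven"));
    }
}
```

The grammar text would be written to a file and handed to the recognizer as a grammar search, and `parse` would sit in the result callback so that anything unexpected is dropped before it reaches the UI.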

It is probably your only option until Google, processor manufacturers and OEMs work out how to offer such functionality without every application installed on the device wanting a piece of the action, which is inevitable.

I'm not sure this response actually provided an answer; it more excludes some options!

Good luck

EDIT: In an environment of wearables, such products will have access to the dedicated cores - at least they need to make sure they do and use a processor with such capabilities. From my interaction with companies developing such tech, they often overlook this or are unaware of its necessity.

Thanks for the comprehensive overview. I'll probably try to implement it with the PocketSphinx library. I see that you already have a lot of experience in this area, so I am marking this answer as accepted, and I hope that a better offline solution will come soon. — AlexejWagner.java, Jan 16 '14 at 11:38

score 1 · Answer 2 · edited May 23 '17 at 12:00

I want to propose a partial answer to your question. Since you don't want the speech recognition to interfere with the UI, you could create a Service. That lets you run a continuous speech recognizer, avoid the graphical widget, and avoid the "beep" sound. I used the following approach and it worked fine for me: Android Speech Recognition Continuous Service
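
The linked answer is not reproduced here. As a rough, framework-free sketch of the restart pattern a continuous recognizer typically relies on (start listening again after every result or error, because the platform recognizer stops after each utterance), assuming a hypothetical `Engine` interface that merely stands in for Android's SpeechRecognizer:

```java
import java.util.ArrayList;
import java.util.List;

final class ContinuousListening {
    /** Hypothetical stand-in for the platform recognizer; not an Android API. */
    interface Engine {
        void startListening();
    }

    /** Restarts the engine after every result or error until stop() is called. */
    static final class Loop {
        private final Engine engine;
        private final List<String> heard = new ArrayList<String>();
        private boolean running;

        Loop(Engine engine) {
            this.engine = engine;
        }

        void start() {
            running = true;
            engine.startListening();
        }

        void stop() {
            running = false;
        }

        // Call this from the recognizer's result callback.
        void onResult(String text) {
            heard.add(text);
            restart();
        }

        // Call this from the recognizer's error callback (timeout, no match, ...).
        void onError(int code) {
            restart();
        }

        // Only re-arm the recognizer while the loop is still active.
        private void restart() {
            if (running) {
                engine.startListening();
            }
        }

        List<String> heard() {
            return heard;
        }
    }
}
```

In a real Service, `start()` would be called when the service is created and `stop()` when it is destroyed, with the two callbacks wired to the recognition listener. Those wiring details are assumptions here and depend on the recognizer you end up using.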


answered Jan 14 '14 at 11:15 by DiegoSahagun
