3

I used OpenEars, which needs a dictionary. It is useful only when the words we speak are listed in the dictionary. I want to convert every word we speak, so I tried Nuance's Dragon speech recognition SDK, but it communicates with a web server. I want to avoid server communication because of security concerns. Is it possible to convert speech to text for all spoken words, as on Windows Mobile, entirely offline and without communicating with a server?

Nikolay Shmyrev

3 Answers

2

Speech recognition with an unlimited vocabulary requires very large computational and memory resources (gigabytes of memory), so it is very hard to do on an iPhone or other embedded device. An iPhone is 9 times slower than a desktop. An iPad is easier since it has a more powerful CPU.

Google has put a lot of effort into making its engine work offline for dictation, and it still prefers to send data to the server because that is significantly more accurate.

Because of that, most solutions running on small devices use a limited vocabulary, though the vocabulary can be large enough that you will not notice the limit. Usually 500-1000 words are enough to cover most practical situations. You can use OpenEars to recognize such a vocabulary.
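One way to check whether a limited vocabulary is large enough for your domain is to measure how much of a sample text the most frequent words cover. The following is a minimal sketch, not part of OpenEars or CMUSphinx; the function names `tokenize` and `coverage` and the sample text are hypothetical, and the tokenizer is deliberately simple (lowercase ASCII words only).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(text, vocab_size):
    """Fraction of word tokens covered by the vocab_size most frequent words."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(vocab_size))
    return covered / total


# Hypothetical sample of commands a user might speak.
sample = "call mom call dad play music stop music call mom"
print(coverage(sample, 2))
```

Running this over real transcripts or texts from your application, with `vocab_size` set to 500 or 1000, gives a rough idea of how often a speaker would step outside the vocabulary.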

To train a language model you need texts from your domain (words and expressions). Language model training is described in the CMUSphinx tutorial. To use the language model, you can use the following OpenEars API call:

- (void)changeLanguageModelToFile:(NSString *)languageModelPathAsString
                   withDictionary:(NSString *)dictionaryPathAsString;

See the API reference for more details.

You can use OpenEars with such a vocabulary and the corresponding language model to support free-form text entry on your device.
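As a rough illustration of preparing the domain texts mentioned above, the sketch below normalizes a few sentences into one-sentence-per-line corpus text and collects the word list that the dictionary would need to cover. The `<s> ... </s>` sentence markers are an assumption based on the corpus format commonly used with the CMUSphinx language modeling tools; check the tutorial for the format your tool version expects. The function names and sample sentences are hypothetical.

```python
import re


def normalize(sentence):
    """Lowercase a sentence and drop punctuation, keeping simple word tokens."""
    return " ".join(re.findall(r"[a-z']+", sentence.lower()))


def build_corpus(sentences):
    """Return (corpus_lines, vocabulary) for a list of raw domain sentences."""
    lines = []
    vocab = set()
    for sentence in sentences:
        norm = normalize(sentence)
        if not norm:
            continue  # skip sentences with no usable words
        # Assumed corpus format: one sentence per line, wrapped in <s> ... </s>.
        lines.append(f"<s> {norm} </s>")
        vocab.update(norm.split())
    return lines, sorted(vocab)


# Hypothetical domain sentences.
domain = ["Call Mom.", "Play some music!", "Stop the music."]
corpus, vocab = build_corpus(domain)
print("\n".join(corpus))
print(vocab)
```

The corpus lines would be the input to language model training, and the vocabulary is the list of words that need pronunciations in the dictionary file passed to OpenEars.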

Nikolay Shmyrev
0

It could be done, but if you are looking for an unlimited-vocabulary speech-to-text converter, it is best if the computations are done on a server. The requirements are probably too great for a device such as a smartphone. The main areas with large requirements are:

  1. Dictionary to map input speech into text.
  2. Computations for speech recognition algorithms to run.

I believe this is the reason why companies like Google run their speech recognition services on a server and not on the phone.

But if the application only needs limited-vocabulary speech to text, then it might be worth giving it a try.

All the best!

Sriram
  • Is there any application in the App Store that works in offline mode? Does the Google Search iPhone application communicate with a server? –  Jul 21 '11 at 08:24
  • Yes. The Google voice search application on every smartphone does communicate with a server. I am not familiar with the App Store and have not heard of any application that does what you want on the phone. But do not take my word for it; you should do a little research on this. – Sriram Jul 21 '11 at 08:51
0

Doesn't pocketsphinx work on iPhone without network connectivity? Aren't there some demo apps floating around, like VocalKit?

http://www.rajeevan.co.uk/pocketsphinx_in_iphone/ may be helpful.

Michael Levy
  • Levy, pocketsphinx needs a dictionary; we have to list in code the words we are going to speak. Will it work without a dictionary? –  Jul 22 '11 at 03:31
  • Sorry, I was focusing on the "no network" part of your question and not the "no dictionary". – Michael Levy Jul 22 '11 at 13:04