18

I've finally managed to build and run pocketsphinx (pocketsphinx_continuous). The problem I'm running into is how to improve accuracy. From what I understand, you can specify a dictionary file (-dict test.dic), so I took the default dictionary file and added some more pronunciations of the same words, for example:

pencil P EH N S AH L
pencil(2) P EH N S IH L

spaghetti S P AH G EH T IY
spaghetti(2) S P UH G EH T IY

Yet pocketsphinx still does not recognize either word at all. I know there is a JSGF file you can specify as well, but that seems more for phrases and grammar. How can I get pocketsphinx to recognize common words such as pencil and spaghetti?

thanks

-Mike

f3lix
Mike6679
  • Anyone? anyone?................ – Mike6679 Dec 29 '10 at 18:37
  • Hi Mike, glad to find someone who can build and run pocketsphinx on Android. I want to do the same thing, and I am having problems building "PocketSphinxAndroidDemo" downloaded from cmusphinx.sourceforge.net. Could you share your experience and list the steps you followed? What's "pocketsphinx_continuous"? Is that a different branch of pocketsphinx? Thanks! gwofu – user602410 Feb 04 '11 at 01:04
  • user602410: pocketsphinx_continuous is a program included with the pocketsphinx distribution. – Jeremy Salwen May 18 '11 at 04:02

6 Answers

10

With something like this, you can't be certain, but I can offer the following suggestions:

  1. Perhaps the language model assigns low probabilities to "spaghetti" and "pencil". As you suggested, you could use a JSGF grammar instead of the N-gram model to test this: give it a simple grammar of about twenty words, including spaghetti and pencil. That way you can see whether the language model is what makes these words hard to recognize, and whether recognition is okay when all the words are treated as equally likely.

  2. Perhaps you simply pronounce these words poorly, even with the alternative dictionary entries. Try either (a) testing other people's voices, or (b) adapting the acoustic model to your voice (see http://cmusphinx.sourceforge.net/wiki/tutorialam).

  3. Also, what does it recognize instead when it fails? If possible, remove the words it wrongly outputs from the dictionary.

Again, for overall accuracy, only three things are really going to help: restricting the grammar, adapting the acoustic model, and perhaps getting higher-quality recording input.
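As a rough illustration of the "restrict the grammar" test, here is a minimal Python sketch that writes a flat JSGF grammar in which every word is an equally likely alternative. The function name, grammar name, rule name, and the filler words are made up for the example; only the JSGF header and rule syntax are standard.

```python
def build_jsgf(words, grammar_name="words", rule_name="word"):
    """Return JSGF text whose single public rule matches any one of `words`."""
    if not words:
        raise ValueError("need at least one word")
    alternatives = " | ".join(words)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <{rule_name}> = {alternatives};\n"
    )


if __name__ == "__main__":
    # Hypothetical test vocabulary; every word must also be in the dictionary.
    vocabulary = ["pencil", "spaghetti", "table", "window", "orange"]
    print(build_jsgf(vocabulary), end="")
```

Save the output as, say, words.gram and pass it to the recognizer in place of the N-gram model. Many pocketsphinx builds take the grammar through a -jsgf option next to -dict, but check the help output of your version before relying on that.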

Jeremy Salwen
7

To improve accuracy you may want to try adapting the acoustic model to your voice. http://cmusphinx.sourceforge.net/wiki/tutorialadapt

To learn how to add new words: http://ghatage.com/tech/2012/12/13/Make-Pocketsphinx-recognize-new-words/

Anup
  • The link to learn how to add new words gives a 404. Do you know if we can find it anywhere else? – Edu Zamora Aug 19 '14 at 09:28
  • Not to revive a dead thread, but the URL appears to be: http://ghatage.com/2012/12/13/Make-Pocketsphinx-recognize-new-words/ – OldWolf Sep 21 '15 at 15:07
  • The correct link is - http://www.ghatage.com/tech/2012/12/13/Make-Pocketsphinx-recognize-new-words. Seems like the permalink structure was changed. – rajagrawal Sep 14 '16 at 06:26
3

Make sure you put a tab (not a space) after the word and before the start of the pronunciation.
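If you want to rule this out mechanically, a small Python sketch like the following rewrites each entry as word, one tab, then the phones separated by single spaces. Whether your pocketsphinx version actually requires a tab is not something this snippet can confirm, and the function names are invented for the example.

```python
def normalize_entry(line):
    """Rewrite one dictionary line as 'word<TAB>PH1 PH2 ...'."""
    parts = line.split()
    if len(parts) < 2:
        # Blank or malformed line: return it stripped, for manual review.
        return line.strip()
    word, phones = parts[0], parts[1:]
    return word + "\t" + " ".join(phones)


def normalize_dictionary(text):
    """Apply normalize_entry to every line of a dictionary file's text."""
    return "\n".join(normalize_entry(line) for line in text.splitlines()) + "\n"


if __name__ == "__main__":
    sample = "pencil P EH N S AH L\npencil(2)   P EH N S IH L\n"
    print(normalize_dictionary(sample), end="")
```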

2

Maybe the problem is with Pocketsphinx. I too was not getting good results with Pocketsphinx, but I was getting very good accuracy with Sphinx4 (for a US speaker with a noise-cancelling microphone). So I compared the two using the same audio recordings. For pocketsphinx I used pocketsphinx_batch with the WSJ audio model and a small-vocabulary language model and dictionary (created online with the CMU Cambridge language modelling toolkit). For Sphinx4 I wrote a small Java program using the Sphinx4 library. The result was that Sphinx4 was much more accurate. All the gory details are at http://www.jaivox.com/pocketsphinx.html.

vjaivox
1

To achieve good accuracy with pocketsphinx:

  • Important! Check that your mic, audio device, and audio files support 16 kHz, because the general model is trained on 16 kHz acoustic examples.
  • You should create your own limited dictionary. You cannot use cmusphinx-voxforge-de.dic as is, because accuracy drops dramatically.
  • You should create your own language model.

You can search for the Jasper project on GitLab to see how it's implemented. Also, please check the documentation.
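For recorded files, the 16 kHz point can be checked with Python's standard wave module. This is only a sketch: the function names are made up, and it inspects the WAV header of a file rather than what your microphone or audio device actually delivers.

```python
import wave


def describe_wav(path):
    """Read basic format information from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "bits": wav.getsampwidth() * 8,
        }


def has_sample_rate(path, expected=16000):
    """True if the file's sample rate equals `expected` (16 kHz by default)."""
    return describe_wav(path)["sample_rate"] == expected


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, describe_wav(name), "OK" if has_sample_rate(name) else "MISMATCH")
```

Run it with one or more .wav paths as arguments. A file recorded at 8 kHz or 44.1 kHz would need resampling, or a model trained at that rate, before you can expect reasonable results.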

Ievgen
0

This is from the CMUSphinx website:

"There are various phonesets to represent phones, such as IPA or SAMPA. CMUSphinx does not yet require you to use any well-known phoneset, moreover, it prefers to use letter-only phone names without special symbols. This requirement simplifies some processing algorithms, for example, you can create files with phone names as part of the filenames without any violating of the OS filename requirements.

A dictionary should contain all the words you are interested in, otherwise the recognizer will not be able to recognize them. However, it is not sufficient to have the words in the dictionary. The recognizer looks for a word in both the dictionary and the language model. Without the language model, a word will not be recognized, even if it is present in the dictionary." https://cmusphinx.github.io/wiki/tutorialdict/
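To check the point in that quote for your own files, a sketch along these lines lists dictionary words that have no unigram entry in an ARPA-format language model. It assumes a plain-text ARPA file (not a binary model) and exact, case-sensitive matching. The function names are invented for the example.

```python
def dictionary_words(dict_text):
    """Collect base words from dictionary text, dropping '(2)'-style suffixes."""
    words = set()
    for line in dict_text.splitlines():
        parts = line.split()
        if parts:
            words.add(parts[0].split("(")[0])
    return words


def arpa_unigrams(arpa_text):
    """Collect the words listed in the \\1-grams: section of an ARPA model."""
    words = set()
    in_unigrams = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = line == "\\1-grams:"
            continue
        if in_unigrams and line:
            fields = line.split()
            if len(fields) >= 2:
                words.add(fields[1])  # fields: logprob, word, optional backoff
    return words


def missing_from_lm(dict_text, arpa_text):
    """Dictionary words that never appear as a unigram in the language model."""
    return sorted(dictionary_words(dict_text) - arpa_unigrams(arpa_text))
```

If missing_from_lm reports pencil or spaghetti, adding pronunciations to the dictionary alone would not be expected to help. The words would also need to be in the language model, or you would need to switch to a grammar that contains them.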