I am developing a C# application using the Kinect that relies on voice input. I have a list of Arabic words the user can say to select different menu items.
I have been searching for the past few days with little success. Here is what I have found:
CMU Sphinx: http://www.ccse.kfupm.edu.sa/~elshafei/AASR.htm The first problem with this is that it is Java-based. I have looked at KVM and the bridge, but I couldn't get far with either. I couldn't even set it up to work in Java, and there are no steps explaining how to use the already-prepared files.
I have also looked at using an SrgsDocument, as suggested in this link: Specifying a pronunciation of a word in Microsoft Speech API, but this seems too complicated for my purposes, and I don't even know if it is what I need.
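For reference, this is roughly what the SRGS route looks like: a grammar file where a `token` carries an explicit pronunciation via the `sapi:pron` extension, so the engine matches the spoken Arabic word even though the written form doesn't follow English pronunciation rules. This is only a sketch: the word "salaam" and its phone string are illustrative assumptions, and the exact phone labels depend on the phone set your recognizer uses (check the engine's phone-set documentation before relying on them).

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="menu"
         xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:sapi="http://schemas.microsoft.com/Speech/2002/06/SRGSExtensions">
  <rule id="menu" scope="public">
    <one-of>
      <!-- Hypothetical menu word; sapi:pron overrides the engine's
           default (English) letter-to-sound guess for this token. -->
      <item><token sapi:pron="S AA L AA M">salaam</token></item>
    </one-of>
  </rule>
</grammar>
```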
I have also looked at Microsoft Speech Recognition Custom Training. That person's problem was similar to mine, but I cannot solve mine the same way.
I cannot use a commercial product such as Sakhr because I do not have the budget for it. Simply adding the words to a grammar will not work, because these words don't obey the normal pronunciation rules of the English language.
Basically, what I'm looking for is some sort of tool that can associate a word written in English (Latin) script with a set of pretrained pronunciations captured from a microphone, which the speech engine can then reference at run time. Is this possible?
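In case it helps frame the question: the closest thing I can see to this in the managed speech API is building the grammar in code and setting each token's `Pronunciation` property explicitly. Below is a minimal sketch of that idea, with many assumptions: the word "iftah" and its phone string are made up for illustration, the Kinect normally feeds its own audio stream to the recognizer rather than the default device, and the phone labels must match whatever phone set the installed recognizer expects.

```csharp
using System;
using Microsoft.Speech.Recognition;               // Kinect samples use Microsoft.Speech rather than System.Speech
using Microsoft.Speech.Recognition.SrgsGrammar;

class ArabicMenuSketch
{
    static void Main()
    {
        // Hypothetical menu word; the pronunciation string is an
        // assumption -- verify it against your engine's phone set.
        var token = new SrgsToken("iftah") { Pronunciation = "IH F T AA H" };
        var rule = new SrgsRule("menu", token);
        var doc = new SrgsDocument { Mode = SrgsGrammarMode.Voice };
        doc.Rules.Add(rule);
        doc.Root = rule;

        var engine = new SpeechRecognitionEngine();
        engine.LoadGrammar(new Grammar(doc));
        engine.SetInputToDefaultAudioDevice();     // a Kinect app would attach the sensor's audio source here
        engine.SpeechRecognized += (s, e) =>
            Console.WriteLine("Heard: " + e.Result.Text);
        engine.RecognizeAsync(RecognizeMode.Multiple);
        Console.ReadLine();
    }
}
```

If something along these lines is the intended mechanism, my question reduces to: where do the custom pronunciations come from, and can they be trained from recordings rather than written as phone strings?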
I am open to any options.
Thanks.