I am thinking of developing voice recognition software for my native language, and I plan to use CMUSphinx-4 for it. There is a CMU dictionary file containing English words, each mapped to a split of the original word to its phoneme boundaries. For example, ABANDONED => [ 'AH', 'B', 'AE', 'N', 'D', 'AH', 'N', 'D' ]. I cannot understand the logic behind this, and I want to develop an algorithm for this conversion of words. If anyone knows an algorithm for this conversion or how this splitting is happening, please share it with me.
1 Answer
original word to its phoneme boundaries
"Boundaries" is the wrong word here. The dictionary maps each word to a phoneme sequence; it says nothing about boundaries.
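To make the "word to phoneme sequence" point concrete, here is a minimal sketch of reading dictionary entries as a plain lookup table. It assumes the common text layout of one entry per line (`WORD PH1 PH2 ...`), alternate pronunciations marked as `WORD(2)`, and comment lines starting with `;;;`. The function names and the sample entries are only illustrative, so check them against the dictionary file you actually use.

```python
import re


def parse_dictionary_line(line):
    """Split one dictionary line into (word, list of phonemes)."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError(f"expected 'WORD PH1 PH2 ...', got: {line!r}")
    word, phonemes = parts[0], parts[1:]
    # Assumed convention: alternate pronunciations look like WORD(2), WORD(3), ...
    word = re.sub(r"\(\d+\)$", "", word)
    return word, phonemes


def load_dictionary(lines):
    """Build {word: [pronunciation, ...]} from an iterable of text lines."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and (assumed) ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        word, phonemes = parse_dictionary_line(line)
        lexicon.setdefault(word, []).append(phonemes)
    return lexicon


# Illustrative entries only; a real dictionary file has many thousands of lines.
sample = [
    ";;; sample entries",
    "ABANDONED AH B AE N D AH N D",
    "HELLO HH AH L OW",
    "HELLO(2) HH EH L OW",
]
lexicon = load_dictionary(sample)
print(lexicon["ABANDONED"][0])
```

Nothing is computed from the spelling here: each pronunciation is simply stored next to its word and looked up.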
If anyone knows an algorithm for this conversion or how this splitting is happening, please share it with me.
Dictionary construction is covered in our tutorial
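As a rough illustration of what an algorithm for a new language could look like, here is a sketch of a rule-based letter-to-phoneme converter using greedy longest-match. It is not taken from the tutorial, and it only makes sense if the language's spelling is close to phonetic; English spelling is too irregular for this, which is why the English dictionary is a lookup table. The `RULES` table and the sample words are entirely hypothetical placeholders.

```python
def graphemes_to_phonemes(word, rules):
    """Convert a spelled word to phonemes by greedy longest-match over `rules`."""
    max_len = max(len(grapheme) for grapheme in rules)
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest candidate spelling first, so "sh" wins over "s" + "h".
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                phonemes.extend(rules[chunk])
                i += size
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phonemes


# Hypothetical rule table: replace with the letters and phone set of your language.
RULES = {
    "a": ["AA"],
    "i": ["IY"],
    "k": ["K"],
    "m": ["M"],
    "s": ["S"],
    "h": ["HH"],
    "sh": ["SH"],
}

for word in ["shami", "kasi"]:
    # Prints one line per word in the "word PH1 PH2 ..." layout.
    print(word, " ".join(graphemes_to_phonemes(word, RULES)))
```

Words that the rules get wrong can still be listed by hand, since the dictionary is ultimately just a list of word and phoneme-sequence pairs.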

Nikolay Shmyrev