I am thinking of developing voice recognition software for my native language, and I plan to use CMUSphinx-4 for it. There is a CMU dictionary file containing English words, each mapped to a split of the original word to its phoneme boundaries. For example, ABANDONED => [ 'AH', 'B', 'AE', 'N', 'D', 'AH', 'N', 'D' ]. I cannot understand the logic behind this, and I want to develop an algorithm for this conversion of words. If anyone knows an algorithm for this conversion, or how this splitting happens, please share it with me.

Cœur
jan

1 Answer

original word to its phoneme boundaries

"Boundaries" is the wrong word here. The dictionary maps a word to a phoneme sequence; it says nothing about boundaries.
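
To make the "word to phoneme sequence" idea concrete, here is a minimal sketch of reading such a dictionary into a lookup table. It assumes a plain-text layout of one entry per line, with the word first and whitespace-separated phonemes after it, alternate pronunciations marked as WORD(2), and comment lines starting with ";;;". The function name `parse_dictionary` and the sample lines are illustrative only, not part of CMUSphinx.

```python
def parse_dictionary(lines):
    """Parse 'WORD PH1 PH2 ...' lines into {word: [pronunciation, ...]}."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with ';;;').
        if not line or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        # Assumed convention: alternates are written as WORD(2), WORD(3), ...
        base = word.split("(", 1)[0]
        lexicon.setdefault(base, []).append(phones)
    return lexicon


# Illustrative sample in the assumed layout, not a real dictionary file.
SAMPLE = [
    ";;; illustrative excerpt",
    "ABANDONED AH B AE N D AH N D",
    "READ R EH D",
    "READ(2) R IY D",
]

lexicon = parse_dictionary(SAMPLE)
print(lexicon["ABANDONED"][0])
```

Note that the lookup is a plain table: nothing in an entry says where one phoneme starts or ends in the spelling of the word.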

If anyone knows an algorithm for this conversion, or how this splitting happens, please share it with me.

Dictionary construction is covered in our tutorial:

http://cmusphinx.sourceforge.net/wiki/tutorialdict
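
As a rough illustration of one way a dictionary can be generated for a language whose spelling is close to phonemic, here is a rule-based sketch that converts a word to phonemes by greedy longest-match lookup. This is an assumption about a possible approach, not the method the tutorial prescribes, and it will not work for irregular orthographies such as English. The `RULES` table and the function `to_phones` are hypothetical; a real table has to come from the phonology of your language.

```python
# Hypothetical grapheme-to-phoneme rules; replace with rules for your language.
RULES = {
    "sh": "SH", "ch": "CH",
    "a": "AA", "i": "IY",
    "k": "K", "s": "S", "h": "HH", "t": "T",
}


def to_phones(word, rules=RULES):
    """Convert a word to phonemes, always trying the longest grapheme first."""
    word = word.lower()
    longest = max(len(grapheme) for grapheme in rules)
    phones = []
    i = 0
    while i < len(word):
        for size in range(min(longest, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                phones.append(rules[chunk])
                i += size
                break
        else:
            # No rule matched at this position.
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phones


# Emit one entry in 'WORD PH1 PH2 ...' form.
print("SHAKTI", " ".join(to_phones("shakti")))
```

Trying the longest grapheme first means "sh" is read as one phoneme instead of "s" followed by "h".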

Nikolay Shmyrev