I have recently started working with CMU's sphinx4 for transcription and, eventually, forced alignment, i.e. aligning audio with its transcript.

I found a project called AutoCap that does essentially what I want to build. I installed it, but it did not work. I tried tweaking it, but all I got was incorrect timestamps.

So I decided to try sphinx4 directly. I successfully transcribed a WAV file using Sphinx's Transcriber.jar, but I could not get it to work for audio containing non-digits data. The README states: 'people who want to transcribe non-digits data should modify the config.xml file to use the correct grammar, language model, and linguist to do so'.

Can anyone help me with any of the following:

  • AutoCap
  • Using Sphinx4 to transcribe non-digits data
  • Forced Alignment

Thanks.

Wilshere

2 Answers

There is a specific project dedicated to speech-to-text alignment. This is not a trivial task. Development takes place in a separate sphinx4 branch. You can find some details here:

http://cmusphinx.sourceforge.net/?s=long+audio+alignment

If you have any questions about this project, you are welcome to ask on the sphinx4 forum:

http://sourceforge.net/projects/cmusphinx/forums/forum/382337
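
To give a feel for what "aligning audio with its transcript" involves, here is a rough sketch of one simple idea: run a recognizer, then match its timestamped words against the known transcript and copy the timestamps over. This is only an illustration and not the algorithm used in the sphinx4 alignment branch. The function name `align_transcript` and the sample data are made up, and the `(word, start, end)` tuples are an assumed format, not sphinx4 output.

```python
from difflib import SequenceMatcher


def align_transcript(transcript_words, hypothesis):
    """Copy timestamps from recognizer output onto the reference transcript.

    transcript_words: list of words from the known transcript.
    hypothesis: list of (word, start_sec, end_sec) tuples; an assumed,
        hypothetical recognizer output format.
    Returns a list of (word, start_sec, end_sec), with None for the times
    of transcript words that have no exact match in the hypothesis.
    """
    hyp_words = [word for word, _, _ in hypothesis]
    # autojunk=False so frequent words are not silently ignored.
    matcher = SequenceMatcher(a=transcript_words, b=hyp_words, autojunk=False)

    # Start with every transcript word unaligned.
    aligned = [(word, None, None) for word in transcript_words]

    # Each matching block is a run of identical words in both sequences.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start, end = hypothesis[block.b + k]
            aligned[block.a + k] = (transcript_words[block.a + k], start, end)
    return aligned


if __name__ == "__main__":
    # Made-up example: the recognizer misheard "brown" as "round".
    transcript = ["the", "quick", "brown", "fox"]
    hypothesis = [
        ("the", 0.0, 0.2),
        ("quick", 0.2, 0.5),
        ("round", 0.5, 0.9),
        ("fox", 0.9, 1.3),
    ]
    for entry in align_transcript(transcript, hypothesis):
        print(entry)
```

Words the recognizer gets wrong end up without timestamps, which is one reason the real task is hard: a practical aligner has to recover those gaps and cope with long recordings.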

Nikolay Shmyrev

I am currently working on the same issue, i.e. transcribing non-digit data. I have looked briefly at the Sphinx-4 Programmer's Guide and used the language models, acoustic models, and JSGF grammar it suggests. However, the results were not up to the mark. I believe that merely tweaking parameters or changing config.xml will not be enough; I think we would need a home-grown algorithm alongside sphinx4 to get better recognition. For my part, I have used the LexTreeLinguist, JSGFGrammar, and the trigram language model, but the results were not great, perhaps because the audio input was not exactly American English. I will work on it a bit more and let you know my results.

Raveesh Sharma