How to use Mozilla DeepSpeech to convert speech to text using its pre-trained model?

Question

I want to convert speech to text using Mozilla DeepSpeech, but the output is really bad.

I downloaded Mozilla's pre-trained model, and this is what I have done so far:



from deepspeech import Model
import scipy.io.wavfile as wav

BEAM_WIDTH = 500
LM_WEIGHT = 1.50
VALID_WORD_COUNT_WEIGHT = 2.10
N_FEATURES = 26
N_CONTEXT = 9

ds = Model(model, N_FEATURES, N_CONTEXT, alphabet, BEAM_WIDTH)

fs, audio = wav.read(path)
data = audio[:, 0]  # convert to mono by keeping only the first channel
prediction = ds.stt(data, fs)

print(test)
print(prediction)

The output is nowhere near what is said in my audio sample. What do I have to do to increase its accuracy?

https://stackoverflow.com/questions/53786907/why-do-the-results-of-this-deepspeech-python-program-differ-from-the-results-i-g — lamo_738, Oct 29 '19 at 04:59

score 0 · Answer 1 · answered Nov 16 '19 at 13:31

I assume it's because you are not including a language model.

The pre-trained model is basically just the acoustic model, which on its own only transcribes the audio to similar-sounding text that may not make sense.

If you combine the acoustic model with a language model (LM), you will likely get better results.

In your code example I can see the parameter LM_WEIGHT, but no reference to the LM itself.

I'm not sure which language you want to integrate DeepSpeech in, but here is an example for Node.js. This is the part where the LM is integrated:

const LM_ALPHA = 0.75;
const LM_BETA = 1.85;
let lmPath = './models/lm.binary';
let triePath = './models/trie';
model.enableDecoderWithLM(lmPath, triePath, LM_ALPHA, LM_BETA);
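Since the code in the question is Python, the same step might look roughly like the sketch below. It assumes the 0.5.1 Python bindings, where enableDecoderWithLM takes the alphabet path as its first argument (later releases changed this signature, so check the client.py that ships with your version). The helper name load_model_with_lm and the model_cls parameter are made up for illustration, and the file paths are placeholders for wherever you extracted the models.

```python
# Sketch only: assumes the DeepSpeech 0.5.1 Python bindings.
# The constants mirror the values used in the question and in the
# Node.js snippet above.
BEAM_WIDTH = 500
N_FEATURES = 26
N_CONTEXT = 9
LM_ALPHA = 0.75
LM_BETA = 1.85


def load_model_with_lm(model_path, alphabet_path, lm_path, trie_path,
                       model_cls=None):
    """Create a model and attach the language model to its decoder.

    model_cls is a hypothetical hook so the function can be exercised
    without DeepSpeech installed; by default the real Model is used.
    """
    if model_cls is None:
        # Imported lazily so the module loads without DeepSpeech present.
        from deepspeech import Model as model_cls

    ds = model_cls(model_path, N_FEATURES, N_CONTEXT, alphabet_path, BEAM_WIDTH)
    # 0.5.1 argument order (assumption): alphabet, lm, trie, alpha, beta.
    ds.enableDecoderWithLM(alphabet_path, lm_path, trie_path, LM_ALPHA, LM_BETA)
    return ds
```

With the model set up this way, ds.stt(data, fs) is called exactly as in the question; only the decoder now scores candidate transcripts against the LM.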

If I'm not mistaken, the LM and trie files are included in the pre-trained model archive:

wget https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz

Otherwise, you can also create your own language model, which would make sense if you only need the model to recognize specific words.
