I want to convert speech to text using Mozilla DeepSpeech, but the output is really bad.
I downloaded Mozilla's pre-trained model and then did this:
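from deepspeech import Model
import scipy.io.wavfile as wav
# Decoder and feature constants. LM_WEIGHT and VALID_WORD_COUNT_WEIGHT are defined but never used below.
BEAM_WIDTH = 500
LM_WEIGHT = 1.50
VALID_WORD_COUNT_WEIGHT = 2.10
N_FEATURES = 26
N_CONTEXT = 9
# model, alphabet, and path are set elsewhere in my script
ds = Model(model, N_FEATURES, N_CONTEXT, alphabet, BEAM_WIDTH)
fs, audio = wav.read(path)
data = audio[:, 0]  # convert to mono by keeping only the first channel
prediction = ds.stt(data, fs)
print(test)
print(prediction)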
The output is nowhere near what is said in my audio sample. What do I have to do to increase its accuracy?
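One thing I have not ruled out is the format of the audio file itself. As far as I understand, the pre-trained model expects 16 kHz, 16-bit, mono PCM audio; that is an assumption on my part and not something I have verified for my file. Below is a minimal sketch of how the WAV header could be checked using only the standard library. The helper name `check_wav_format` and the `EXPECTED_*` constants are made up for illustration and are not part of DeepSpeech.

```python
import wave

# Assumed input format of the pre-trained model (an assumption, not verified here).
EXPECTED_RATE = 16000      # samples per second
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit PCM
EXPECTED_CHANNELS = 1      # mono


def check_wav_format(wav_path):
    """Return a list of mismatches between the file's header and the assumed format.

    Hypothetical helper: it only reads the WAV header, it does not touch the samples.
    """
    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()
        width = wf.getsampwidth()
        channels = wf.getnchannels()

    problems = []
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"sample width is {width} bytes, expected {EXPECTED_SAMPLE_WIDTH}")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"file has {channels} channels, expected {EXPECTED_CHANNELS}")
    return problems
```

Calling `check_wav_format(path)` on my sample would return an empty list if the header matches the assumed format, and one message per mismatch otherwise. My code above indexes `audio[:,0]`, so the file has more than one channel, and I have not checked whether its sample rate is 16 kHz.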