3

I am new to DeepSpeech. I followed this link to write speech-to-text code, but my results are nowhere near the original speech. I am using DeepSpeech 0.6.1 and have installed the matching pretrained model. I used this link to create my WAV file with the default options. Below is my code.

import numpy as np
import wave
from deepspeech import Model
from scipy.io import wavfile as wav
import speech_recognition as sr

audio_file = "D:/Dataset/DeepSpeech/converted_stt1.wav"
ds = Model('D:/Dataset/DeepSpeech/deepspeech-0.6.1-models/models/output_graph.pbmm',500)
ds.enableDecoderWithLM('D:/Dataset/DeepSpeech/deepspeech-0.6.1-models/models/lm.binary','D:/Dataset/DeepSpeech/deepspeech-0.6.1-models/models/trie', 0.75, 1.85)
rate, audio = wav.read(audio_file)
print(audio)
transcript =ds.stt(audio)
print(transcript)

I suspect this issue is caused by my audio format or something similar. Please help me with this: how can I make the most of the DeepSpeech library?

Ironman
  • Are you using your own recording in the file? Have you done noise removal on it? –  Sep 27 '20 at 12:18
  • Same issue here. It gives good results on the WAV audio files provided on the project page, but not on live recordings. –  Sep 27 '20 at 12:19
  • It gives good results on noise-reduced files. – Chandan Feb 17 '21 at 13:35

2 Answers

1

I am also using DeepSpeech v0.6.1.

One thing I noticed is that the problem is with this import:

from scipy.io import wavfile as wav 

When I ran the same file through the client.py provided by Mozilla DeepSpeech, the results changed.

Link to the client file: client.py
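As far as I recall, client.py reads the audio with the standard-library wave module and turns the raw frames into a 16-bit NumPy array, rather than going through scipy. A minimal sketch of that approach follows. The helper name load_wav_int16 is my own (hypothetical), and it assumes the file is 16-bit PCM:

```python
import wave

import numpy as np


def load_wav_int16(path):
    """Read a 16-bit PCM WAV file.

    Returns (sample_rate, channel_count, samples), where samples is a
    flat int16 array (interleaved if the file has more than one channel).
    """
    with wave.open(path, "rb") as fin:
        # DeepSpeech expects 16-bit samples, so reject anything else early.
        if fin.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        rate = fin.getframerate()
        channels = fin.getnchannels()
        audio = np.frombuffer(fin.readframes(fin.getnframes()), dtype=np.int16)
    return rate, channels, audio


# Usage with the model from the question (not run here):
# rate, channels, audio = load_wav_int16("D:/Dataset/DeepSpeech/converted_stt1.wav")
# print(rate, channels, audio.dtype)
# transcript = ds.stt(audio)
```

Printing the rate, channel count and dtype is a quick way to see whether the file differs from the sample files on the project page that transcribe well.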

0

You will need to account for the sample rate of your input audio. Otherwise DeepSpeech will assume it has the same sample rate as the audio the model was trained on.

You can get your model's expected rate by calling ds.sampleRate() and then convert your input audio to that rate. https://deepspeech.readthedocs.io/en/v0.6.1/Python-API.html#native_client.python.Model.sampleRate
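A rough sketch of the conversion step, assuming 16-bit PCM input as returned by scipy's wav.read. The helper name to_model_rate and the 16000 Hz default are my own assumptions; in practice the target should come from ds.sampleRate(). The resampling uses scipy.signal.resample_poly, which is one reasonable choice and not something DeepSpeech itself prescribes:

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def to_model_rate(audio, rate, target_rate=16000):
    """Return mono int16 audio resampled from `rate` to `target_rate`.

    Assumes `audio` holds 16-bit PCM samples, shaped (n,) for mono or
    (n, channels) for multi-channel input.
    """
    audio = np.asarray(audio)
    if audio.ndim == 2:
        # Average the channels to get a single mono signal.
        audio = audio.mean(axis=1)
    if rate != target_rate:
        # Reduce the rate ratio so the polyphase filter stays small.
        g = gcd(int(rate), int(target_rate))
        audio = resample_poly(audio.astype(np.float64), target_rate // g, rate // g)
    # Round and clip back into the int16 range before converting.
    return np.clip(np.round(audio), -32768, 32767).astype(np.int16)


# Usage with the code from the question (not run here):
# rate, audio = wav.read(audio_file)
# audio = to_model_rate(audio, rate, ds.sampleRate())
# print(ds.stt(audio))
```

If the recording is already mono at the model's rate, the function returns the samples unchanged, so it is safe to call unconditionally.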

Henderz