4

The error I'm getting is:

FileNotFoundError: [Errno 2] SoX not found, use 16000hz files or install it: The system cannot find the file specified

My audio file comes from an mp4 video that I converted to WAV with VLC. The sampling rate is 8000 Hz by default, and it appears that DeepSpeech needs 16 kHz files and otherwise relies on SoX to convert them.

I ran `pip install SoX` and `pip install --upgrade SoX`:
Requirement already satisfied: SoX in e:\downloads\deep speech\lib\site-packages (1.4.1)
Requirement already satisfied: numpy>=1.9.0 in e:\downloads\deep speech\lib\site-packages (from SoX) (1.21.4)
So the package is installed. I then added `E:\Downloads\Deep Speech\Lib\site-packages` to the system environment variables on Windows just in case. I'm new to Python in general and stumped here.

Could someone give me a hand?

Pawara Siriwardhane
Vendolheim
  • Sox isn't a Python package. You'll need to have the program itself installed. – AKX Nov 17 '21 at 01:34
  • Okay, great. I've got it working now. I don't know if I should make another post (because it's a different issue), but the transcription is really awful at the moment. Is that the current state of DeepSpeech or can I use a better model for training? I currently run `deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio Test.wav > output.txt` – Vendolheim Nov 17 '21 at 16:56

2 Answers

5

I faced the same issue and fixed it by converting the audio to a 16000 Hz sample rate. Please try:

`ffmpeg -i input.wav -ar 16000 output.wav`
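If you want to check a file before converting it, here is a minimal Python sketch. It reads the sample rate from the WAV header with the standard-library `wave` module and builds the same ffmpeg command as above. The function names (`wav_sample_rate`, `needs_resampling`, `build_ffmpeg_command`) are made up for this example, and it assumes an uncompressed PCM WAV file and that `ffmpeg` is on your PATH.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=16000):
    """True if the file's sample rate differs from the target rate."""
    return wav_sample_rate(path) != target_rate


def build_ffmpeg_command(src, dst, rate=16000):
    """Build the argument list for: ffmpeg -i src -ar rate dst."""
    return ["ffmpeg", "-i", src, "-ar", str(rate), dst]
```

To actually convert, you could pass the list to `subprocess.run(build_ffmpeg_command("input.wav", "output.wav"), check=True)` when `needs_resampling("input.wav")` is true.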
talha ai
3

You should install SoX itself with apt-get on Ubuntu, not with pip. The pip package is only a Python wrapper and does not include the `sox` program.

`sudo apt-get install sox`
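To confirm that the program (and not just the pip package) is visible, a quick check from Python could look like the sketch below. It assumes DeepSpeech finds SoX by looking for a `sox` executable on PATH, which is what the error message suggests. The helper names `find_executable` and `sox_status` are made up for this example.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def sox_status():
    """Describe whether a `sox` executable can be found on PATH."""
    path = find_executable("sox")
    if path is None:
        return "sox not found on PATH: install the SoX program and add its folder to PATH"
    return f"sox found at {path}"


if __name__ == "__main__":
    print(sox_status())
```

If this reports that sox is not found even though `pip install SoX` succeeded, the program itself is still missing. Adding `site-packages` to the environment variables does not help, because the executable is not in that folder.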
Milad