I’m trying to set up Whisper speech-to-text, but I'm having some trouble. After running the script I get a traceback that doesn't give me much of a clue. It ends with:
FileNotFoundError: [WinError 2] The system cannot find the file specified
I tried a number of path combinations, checked that the file exists, and installed ffmpeg, but nothing works. I don't have a lot of experience with Python, and this seems to be a common problem online, but I have not found a solution so far.
Script:
import os
import whisper

file_path = os.path.normcase(r'jfk.wav')
file_path2 = 'C:/Users/me/Downloads/jfk.wav'
file_path3 = 'C:\\Users\\me\\Downloads\\jfk.wav'
#file_pathpath4 = 'C:\Users\me\Downloads\jfk.wav'
file_path5 = 'jfk.wav'

print("Hello Whisper")

def speech_to_text(audio_file):
    if os.path.isfile(audio_file):
        print('File exists')
        model = whisper.load_model("base")
        result = model.transcribe(audio_file, verbose=True)
        result['text']
    else:
        print('File does not exist')

speech_to_text(file_path5)
This is the full traceback:
C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\timing.py:58: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def backtrace(trace: np.ndarray):
Hello Whisper
File exists
C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
  File "c:\Users\me\Local\Cookbook\CookbookCpp\EmbedPython\python\Whisper.py", line 21, in <module>
    speech_to_text(file_path5)
  File "c:\Users\me\Local\Cookbook\CookbookCpp\EmbedPython\python\Whisper.py", line 16, in speech_to_text
    result = model.transcribe(audio_file, verbose=True)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\transcribe.py", line 121, in transcribe
    mel = log_mel_spectrogram(audio, padding=N_SAMPLES)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\audio.py", line 130, in log_mel_spectrogram
    audio = load_audio(audio)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\whisper\audio.py", line 46, in load_audio
    ffmpeg.input(file, threads=0)
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\ffmpeg\_run.py", line 313, in run
    process = run_async(
  File "C:\Users\me\AppData\Roaming\Python\Python39\site-packages\ffmpeg\_run.py", line 284, in run_async
    return subprocess.Popen(
  File "C:\Program Files\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
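Since the script prints "File exists" and the traceback ends in subprocess.Popen inside the ffmpeg wrapper, the file that cannot be found may be the ffmpeg executable rather than jfk.wav. A minimal diagnostic sketch, assuming the wrapper launches a program named `ffmpeg` resolved through PATH (the helper names `find_executable` and `path_entries_mentioning` are made up for illustration and are not part of Whisper):

```python
import os
import shutil


def find_executable(name="ffmpeg"):
    """Return the full path that `name` resolves to via PATH, or None."""
    return shutil.which(name)


def path_entries_mentioning(keyword="ffmpeg"):
    """List PATH directories whose text contains `keyword` (case-insensitive)."""
    entries = os.environ.get("PATH", "").split(os.pathsep)
    return [entry for entry in entries if keyword.lower() in entry.lower()]


if __name__ == "__main__":
    # None here would suggest this Python process cannot see ffmpeg at all.
    print("ffmpeg resolves to:", find_executable("ffmpeg"))
    print("PATH entries mentioning ffmpeg:", path_entries_mentioning("ffmpeg"))
```

If the first line prints None, the installed ffmpeg is probably not on the PATH seen by this interpreter. A terminal or editor opened before PATH was changed keeps the old value until it is restarted.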