I have a large audio file that I would like to transcribe. For this, I opted for silence-based splitting, breaking the audio file into chunks at the silences between sentences. However, the transcription takes longer than expected, even for a short audio file.
from pydub import AudioSegment
from pydub.silence import split_on_silence
voice = AudioSegment.from_wav(path)  # path to the audio file
chunks = split_on_silence(voice, min_silence_len=500, silence_thresh=voice.dBFS - 14, keep_silence=500)
To process these chunks faster, I tried spawning one thread per chunk, as shown below:
from threading import Thread

n_threads = len(chunks)
thread_list = []
for thr in range(n_threads):
    thread = Thread(target=threaded_process, args=(chunks[thr],))
    thread_list.append(thread)
    thread_list[thr].start()

for thread in thread_list:
    thread.join()
The function 'threaded_process' is supposed to perform the speech-to-text conversion on each chunk:
import os

def threaded_process(chunks):
    fh = open("recognized.txt", "w+")
    i = 0
    for chunk in chunks:
        # pad each chunk with a short stretch of silence on both sides
        chunk_silent = AudioSegment.silent(duration=10)
        audio_chunk = chunk_silent + chunk + chunk_silent
        print("saving chunk{0}.wav".format(i))
        audio_chunk.export("./chunk{0}.wav".format(i), bitrate='192k', format="wav")
        file = 'chunk' + str(i) + '.wav'
        print("Processing chunk " + str(i))
        rec = audio_to_text(file)  # another function that does the actual speech-to-text conversion (IBM Watson Speech to Text API)
        if rec == "Error5487":
            return "Error5487E"
        fh.write(rec + " ")
        os.remove(file)
        i += 1
    fh.close()
However, the chunks still seem to be processed one after another rather than in parallel, and I also get this error: [WinError 32] The process cannot access the file because it is being used by another process: 'chunk0.wav'. Why is this happening?
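For what it's worth, this is the direction I was trying to go in. It is only a rough sketch: every chunk gets its own file name and a thread pool collects the transcripts in order. It assumes my existing audio_to_text helper can safely be called from several threads at once, which I have not verified for the Watson client.

import os
from concurrent.futures import ThreadPoolExecutor

from pydub import AudioSegment


def process_chunk(args):
    # args is an (index, chunk) pair produced by enumerate(chunks)
    i, chunk = args
    padded = AudioSegment.silent(duration=10) + chunk + AudioSegment.silent(duration=10)
    filename = "chunk{0}.wav".format(i)  # unique file per chunk, so threads don't collide
    padded.export(filename, bitrate="192k", format="wav")
    try:
        return audio_to_text(filename)   # my existing Watson helper (assumed thread-safe)
    finally:
        os.remove(filename)


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process_chunk, enumerate(chunks)))

with open("recognized.txt", "w") as fh:
    fh.write(" ".join(r for r in results if r != "Error5487"))

Is something along these lines the right way to parallelize the chunk processing, or is the sequential behaviour and the WinError 32 caused by something else entirely?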