Transcribe only specific portions of audio using Google Cloud speech-to-text

Question

I can't find any instructions for this in the documentation. I can successfully transcribe audio from Google Cloud Storage, but it transcribes the whole file. To save on costs, I would like to transcribe only portions of the audio, preferably selected by timestamps. Is there a method or parameter for this?

Split the audio based upon timestamps first. https://stackoverflow.com/questions/37999150/how-to-split-a-wav-file-into-multiple-wav-files — John Hanley, Jul 17 '21 at 03:21

score 1 · Answer 1 · answered Jul 21 '21 at 14:11


You can split the audio file by timestamp first, as suggested in the comment, and then send only the resulting clip for transcription. The following Python code, taken from the Stack Overflow question linked in that comment, does the split.

from pydub import AudioSegment

# Start and end of the portion to keep, in seconds (example values).
t1, t2 = 10, 25
# pydub slices in milliseconds, so convert from seconds.
t1 = t1 * 1000
t2 = t2 * 1000
newAudio = AudioSegment.from_wav("oldSong.wav")
newAudio = newAudio[t1:t2]
newAudio.export("newSong.wav", format="wav")  # Writes a WAV file to the current directory.

The code uses the pydub library, which reads WAV files natively and supports other audio formats such as MP3 and FLV when ffmpeg is installed.
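If installing pydub is not an option, the same timestamp-based split can be sketched with Python's standard-library wave module. This is only a minimal sketch under stated assumptions: the input is an uncompressed PCM WAV file, and the function name extract_segment and the file names in the usage note are hypothetical, not part of any Google or pydub API.

```python
import wave


def extract_segment(src_path, dst_path, start_s, end_s):
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file.

    Hypothetical helper: assumes src_path is an uncompressed PCM WAV file.
    Returns the number of audio frames written.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Convert seconds to frame indices and clamp them to the file length.
        start = max(0, min(int(start_s * rate), total))
        end = max(start, min(int(end_s * rate), total))
        src.setpos(start)
        frames = src.readframes(end - start)
        params = src.getparams()

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return end - start
```

For example, extract_segment("oldSong.wav", "newSong.wav", 10, 25) would write seconds 10 to 25 into newSong.wav, which can then be uploaded and transcribed in place of the full file.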

answered Jul 21 '21 at 14:11

Krish


I appreciate the assistance; however I wanted to know if the Google API has a way to do this in the cloud. Looks like there isn't a way. The way my program is coded, it will take longer to split the audio first. The help is much appreciated nevertheless. – James Kurian Jul 23 '21 at 04:22

@JamesKurian Unfortunately, the Speech-to-Text API does not currently support transcribing specific portions of an audio file, but we are working on it. We cannot provide an ETA at the moment, but you can follow progress on the [issue tracker](https://issuetracker.google.com/128645740). You can also star the issue to receive automatic updates and give it traction, as described in this [guide](https://developers.google.com/issue-tracker/guides/subscribe#starring_an_issue). – Krish Jul 27 '21 at 14:24
