Whisper is a general-purpose speech recognition library by OpenAI.
Questions tagged [openai-whisper]
175 questions
12
votes
3 answers
How can I get word-level timestamps in OpenAI's Whisper ASR?
I use OpenAI's Whisper Python library for speech recognition. How can I get word-level timestamps?
To transcribe with OpenAI's Whisper (tested on Ubuntu 20.04 x64 LTS with an Nvidia GeForce RTX 3090):
conda create -y --name whisperpy39…

Franck Dernoncourt
- 77,520
- 72
- 342
- 501
8
votes
6 answers
OpenAI Whisper; FileNotFoundError: [WinError 2] The system cannot find the file specified
I wanted to check out OpenAI Whisper and see if I could find some personal applications for it.
I went to GitHub and followed the instructions to set it up.
My primary system runs Windows 11, and I get this error: "FileNotFoundError: [WinError 2]…

Gordon Freeman
- 133
- 1
- 1
- 5
4
votes
1 answer
Python Poetry fails to add openai-whisper due to triton installation error
I'm trying to use openai-whisper, with Poetry as my environment and dependency manager, but I keep getting errors when trying to install it. The error I get is: Installing triton (2.0.0): Failed
I tried the typical poetry add and this is the…

Brooks
- 41
- 2
4
votes
1 answer
Whisper API from a recorded audio blob
I am creating a transcriber using the OpenAI Whisper API in Node.js and React. I want the user to be able to record an audio file in the browser and transcribe their recording. I am doing this by saving the buffer data of the audio blob they have…

Leviathan X
- 71
- 3
4
votes
1 answer
(Mis)-using open.ai whisper for text-to-text translation
I noticed that when transcribing speech in multiple languages, the OpenAI Whisper speech-to-text library sometimes accurately recognizes inserts in another language and provides the expected output, for example: 八十多个人 is the same as 八十几个人. So 多 and…

ccpizza
- 28,968
- 18
- 162
- 169
4
votes
1 answer
How can I give some hint phrases to OpenAI's Whisper ASR?
I use OpenAI's Whisper Python library for speech recognition. How can I give it some hint phrases, as can be done with some other ASR systems such as Google's?
To transcribe with OpenAI's Whisper (tested on Ubuntu 20.04 x64 LTS with an Nvidia GeForce RTX…

Franck Dernoncourt
- 77,520
- 72
- 342
- 501
3
votes
2 answers
Whisper error: FP16 is not supported on CPU; using FP32 instead
I'm trying to use Whisper on my computer. I have an NVIDIA RTX 2060 GPU and have installed CUDA and FFmpeg.
I'm running this code:
import whisper
model = whisper.load_model("medium")
result =…

athem boukhmayer
- 31
- 1
- 3
3
votes
2 answers
Sending an audio file to the OpenAI Whisper model
I am converting my recorded audio file to a Blob object and then reading it with FileReader to make a POST request to the OpenAI Whisper model.
It expects an audio file and a model name, i.e. whisper-1.
The error I am getting is:
1 validation error for…

manik-sharma
- 31
- 4
3
votes
0 answers
How can I deactivate OpenAI Whisper's normalization for audio input longer than 30 secs? (transcribing filler words)
OpenAI's Whisper delivers nice and clean transcripts. Now I would like it to produce more raw transcripts that also have filler words (ah, mh, mhm, uh, oh, etc.) in it. The post here tells me that it's possible by setting the normalization to false:…

Psychic Birdy
- 131
- 2
3
votes
0 answers
How to load a PyTorch model directly to the GPU
I'm trying to load the Whisper large-v2 model onto a GPU, but to do that, PyTorch seems to unpickle the whole model in the CPU's RAM, using more than 10 GB of memory, and only then loads it into GPU memory.
PyTorch's torch.load…

Miguel Pinheiro
- 192
- 11
3
votes
2 answers
Error loading audio when running the OpenAI Whisper model
The problem I'm trying to solve is that I can't run the Whisper model on some audio; it reports an error related to audio decoding. payload.wav: Invalid data found when processing input. raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from…

John mick
- 31
- 1
- 2
3
votes
2 answers
Can't create .exe with PyInstaller when OpenAI's Whisper is imported
I am trying to create a small program that works with OpenAI's Whisper. I then build the Python script into an .exe file using PyInstaller (auto-py-to-exe). When I run the .exe file, I get the following error:
Traceback (most recent call last):
…

Arthy01
- 41
- 4
3
votes
1 answer
OpenAI Whisper is returning the transcription in English instead of the native language
When I use the OpenAI Whisper model on Hindi audio, it returns the transcription in English instead of Hindi.
How do I get the output in Hindi itself? Is there a setting that can be changed?
mel =…

Sarath Haridas
- 31
- 2
3
votes
1 answer
How can I fine-tune a model from OpenAI's Whisper ASR on my own training data?
I use OpenAI's Whisper Python library for speech recognition. I have some training data: either text only, or audio + corresponding transcription. How can I fine-tune a model from OpenAI's Whisper ASR on my own training data?

Franck Dernoncourt
- 77,520
- 72
- 342
- 501
2
votes
0 answers
How to map word-level timestamps to the text of a given transcript?
I am currently developing a tool to visualize song lyrics. The tool computes the similarity in the phonetics of syllables and assigns a rhyme group to each syllable. Syllables belonging to the same group will be highlighted in the same color. To…

paulpelikan
- 21
- 2
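
For the last question above (mapping word-level timestamps onto the words of a known transcript), one possible approach is sequence alignment. The following is only a minimal sketch, not the asker's method and not part of Whisper: it assumes the recognizer output is already available as a list of (word, start, end) tuples, and the names normalize and align_timestamps are hypothetical helpers introduced here for illustration. It uses Python's standard difflib to align the two word sequences; transcript words with no exact match keep None timestamps.

```python
from difflib import SequenceMatcher


def normalize(word):
    # Lowercase and drop punctuation so "Hello," matches "hello".
    return "".join(ch for ch in word.lower() if ch.isalnum())


def align_timestamps(recognized, transcript_words):
    """Copy timestamps from recognized words onto transcript words.

    recognized: list of (word, start, end) tuples (assumed input format).
    transcript_words: list of words from the reference transcript.
    Returns a list of (transcript_word, start, end); start and end are
    None where no matching recognized word was found.
    """
    rec_norm = [normalize(w) for w, _, _ in recognized]
    ref_norm = [normalize(w) for w in transcript_words]
    aligned = [(w, None, None) for w in transcript_words]

    matcher = SequenceMatcher(a=rec_norm, b=ref_norm, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start, end = recognized[block.a + k]
            aligned[block.b + k] = (transcript_words[block.b + k], start, end)
    return aligned


# Hypothetical data: the recognizer misheard "world" as "wurld".
recognized = [("hello", 0.0, 0.4), ("wurld", 0.5, 0.9), ("again", 1.0, 1.4)]
transcript = ["Hello,", "world", "again!"]
result = align_timestamps(recognized, transcript)
```

Unmatched words could then be given interpolated times from their matched neighbours; that step is left out here.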