Highest Voted 'openai-whisper' Questions

12

votes

3 answers

How can I get word-level timestamps in OpenAI's Whisper ASR?

I use OpenAI's Whisper Python library for speech recognition. How can I get word-level timestamps? To transcribe with OpenAI's Whisper (tested on Ubuntu 20.04 x64 LTS with an Nvidia GeForce RTX 3090): conda create -y --name whisperpy39…

asked Sep 23 '22 at 02:15

Franck Dernoncourt

77,520
72
342
501

8

votes

6 answers

OpenAI Whisper; FileNotFoundError: [WinError 2] The system cannot find the file specified

I wanted to check out OpenAI Whisper and see if I could find some personal applications for it. I went to GitHub and followed the instructions to set it up. My primary system runs Windows 11 and I get this error: "FileNotFoundError: [WinError 2]…

python-3.x filenotfounderror openai-whisper

asked Sep 25 '22 at 15:08

Gordon Freeman

133
1
1
5

4

votes

1 answer

Python Poetry fails to add openai-whisper due to triton installation error

I'm trying to use openai-whisper. I'm using Poetry as my environment and dependency manager, but I keep getting errors when trying to download it. The error I get is: Installing triton (2.0.0): Failed. I tried the typical poetry add and this is the…

python-3.x python-poetry openai-whisper

asked May 24 '23 at 22:54

Brooks

41
2

4

votes

1 answer

whisper api from a recorded audio blob

I am creating a transcriber using the OpenAI Whisper API in Node.js and React. I want the user to be able to record an audio file in the browser and transcribe their recording. I am doing this by saving the buffer data of the audio blob they have…

node.js reactjs openai-api recorder openai-whisper

asked Apr 25 '23 at 19:46

Leviathan X

71
3

4

votes

1 answer

(Mis)-using open.ai whisper for text-to-text translation

I noticed that transcribing speech in multiple languages with the OpenAI Whisper speech-to-text library sometimes accurately recognizes inserts in another language and provides the expected output, for example: 八十多个人 is the same as 八十几个人. So 多 and…

machine-learning speech-recognition speech-to-text machine-translation openai-whisper

asked Dec 03 '22 at 15:12

ccpizza

28,968
18
162
169

4

votes

1 answer

How can I give some hint phrases to OpenAI's Whisper ASR?

I use OpenAI's Whisper Python library for speech recognition. How can I give some hint phrases, as can be done with some other ASR systems such as Google's? To transcribe with OpenAI's Whisper (tested on Ubuntu 20.04 x64 LTS with an Nvidia GeForce RTX…

python speech-recognition openai-api openai-whisper hint-phrases

asked Sep 24 '22 at 00:04

Franck Dernoncourt

77,520
72
342
501

3

votes

2 answers

whisper AI error : FP16 is not supported on CPU; using FP32 instead

I'm trying to use Whisper AI on my computer. I have an NVIDIA RTX 2060 GPU and have installed CUDA and FFmpeg. I'm running this code: import whisper model = whisper.load_model("medium") result =…

python speech-recognition text-to-speech openai-api openai-whisper

asked Apr 01 '23 at 19:32

athem boukhmayer

31
1
3

3

votes

2 answers

sending audio file to open ai whisper model

I am converting my recorded audio file to a Blob object and then reading it with FileReader to make a POST request to the OpenAI Whisper model. It expects an audio file and a model name, i.e. whisper-1. The error I am getting is: 1 validation error for…

javascript blob filereader openai-api openai-whisper

asked Mar 27 '23 at 14:48

manik-sharma

31
4

3

votes

0 answers

How can I deactivate OpenAI Whisper's normalization for audio input longer than 30 secs? (transcribing filler words)

OpenAI's Whisper delivers nice and clean transcripts. Now I would like it to produce more raw transcripts that also include filler words (ah, mh, mhm, uh, oh, etc.). The post here tells me that it's possible by setting the normalization to false:…

python python-3.x speech-recognition openai-whisper

asked Mar 27 '23 at 08:52

Psychic Birdy

131
2

3

votes

0 answers

How to load a pytorch model directly to the GPU

I'm trying to load the Whisper large-v2 model onto a GPU, but to do that, it seems that PyTorch unpickles the whole model in CPU RAM, using more than 10 GB of memory, and then loads it into GPU memory. PyTorch's torch.load…

pytorch openai-whisper

asked Mar 08 '23 at 16:42

Miguel Pinheiro

192
11

3

votes

2 answers

Error loading audio when running Whisper OpenAI model

The problem I'm trying to solve is that I can't run the Whisper model on some audio files; it reports an error related to audio decoding. payload.wav: Invalid data found when processing input. raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from…

python python-3.x audio openai-whisper

asked Mar 03 '23 at 09:49

John mick

31
1
2

3

votes

2 answers

Can't create .exe with PyInstaller when OpenAI's Whisper is imported

I am trying to create a small program that works with OpenAI's Whisper. I then build the Python script into an .exe file using PyInstaller (auto-py-to-exe). When I run the .exe file, I get the following error: Traceback (most recent call last): …

python python-3.x pyinstaller auto-py-to-exe openai-whisper

asked Feb 28 '23 at 16:16

Arthy01

41
4

3

votes

1 answer

Open AI Whisper is returning the transcription in English instead of the native language

When I use the OpenAI Whisper model on Hindi audio, it returns the transcription in English instead of Hindi. How do I get the output in Hindi itself? Is there a setting that can be changed? mel =…

python speech-recognition openai-whisper

asked Oct 02 '22 at 22:30

Sarath Haridas

31
2

3

votes

1 answer

How can I finetune a model from OpenAI's Whisper ASR on my own training data?

I use OpenAI's Whisper Python library for speech recognition. I have some training data: either text only, or audio + corresponding transcription. How can I fine-tune a model from OpenAI's Whisper ASR on my own training data?

python speech-recognition openai-api fine-tune openai-whisper

asked Sep 25 '22 at 00:28

Franck Dernoncourt

77,520
72
342
501

2

votes

0 answers

How to map word level timestamps to text of a given transcript?

I am currently developing a tool to visualize song lyrics. The tool computes the similarity in the phonetics of syllables and assigns a rhyme group to each syllable. Syllables belonging to the same group will be highlighted in the same color. To…

json nlp mapping sentence-similarity openai-whisper

asked Jun 28 '23 at 15:06

paulpelikan

21
2
