I'm trying to follow this example: https://www.thepythoncode.com/article/using-speech-recognition-to-convert-speech-to-text-python
I'm following the part of the example that does speech recognition from the microphone, and I wrote the code below:
class Voice:
    def __init__(self) -> None:
        import speech_recognition as sr
        r = sr.Recognizer()
        print("Recognizing...")
        with sr.Microphone() as source:
            # read the audio data from the default microphone
            audio_data = r.record(source, duration=5)
            # convert speech to text
            text = r.recognize_google(audio_data, language="es-ES")
            print(text)

new_voice = Voice()
I only get the right result when I use r.record with a fixed duration, such as 5 seconds:
audio_data = r.record(source, duration=5)
But I want something similar to what Google does, where it recognizes when the user stops talking.
I tried these 3 other ways, but none of them ever returns:
audio_data = r.listen(source)
audio_data = r.listen(source, timeout=2)
audio_data = r.record(source)
The terminal doesn't give me any error; it's as if it's still waiting for me to talk or something.
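A possible explanation, offered as an assumption rather than a confirmed diagnosis: listen() treats input louder than the recognizer's energy_threshold as speech and only stops after pause_threshold seconds of quiet. If the background noise stays above that threshold, the phrase never "ends", which would match the symptom of no error and no return. The sketch below calibrates against ambient noise first and also sets a hard cap on phrase length. The helper names listen_for_phrase and transcribe_once, and the chosen timings, are hypothetical; the library calls (adjust_for_ambient_noise, listen with timeout and phrase_time_limit, recognize_google) are the ones SpeechRecognition provides.

```python
def listen_for_phrase(recognizer, source, calibration_seconds=1.0,
                      start_timeout=5.0, max_phrase_seconds=15.0):
    """Capture one phrase, stopping at the first pause or at the hard cap."""
    # Sample the background noise so energy_threshold sits above it.
    recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
    # Seconds of quiet that count as the end of a phrase (0.8 is the default).
    recognizer.pause_threshold = 0.8
    # timeout: how long to wait for speech to start.
    # phrase_time_limit: upper bound on phrase length, so the call always returns.
    return recognizer.listen(source, timeout=start_timeout,
                             phrase_time_limit=max_phrase_seconds)


def transcribe_once(language="es-ES"):
    """Hypothetical helper: listen to the default microphone once and return text."""
    import speech_recognition as sr  # requires SpeechRecognition and PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        try:
            audio_data = listen_for_phrase(recognizer, source)
        except sr.WaitTimeoutError:
            return None  # nobody started speaking within start_timeout
    try:
        return recognizer.recognize_google(audio_data, language=language)
    except sr.UnknownValueError:
        return None  # audio was captured but could not be understood
```

Calling print(transcribe_once()) should then print the recognized text, or None if nothing was heard or understood. If it still hangs until the cap, raising recognizer.energy_threshold manually may be worth a try.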
I found this in the record method documentation:
Records up to duration seconds of audio from source (an AudioSource instance) starting at offset (or at the beginning if not specified) into an AudioData instance, which it returns.
If duration is not specified, then it will record until there is no more audio input.
However, I literally muted the mic after speaking, and even then the terminal stayed the same.
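My working assumption about why muting doesn't help: "no more audio input" seems to mean that the stream returns an empty read, as happens at the end of an audio file, whereas a muted microphone keeps delivering buffers full of silence. The toy model below is not the library's code; record_until_empty, file_like_reader and muted_mic_reader are hypothetical names that only illustrate the difference.

```python
def record_until_empty(read_chunk, max_chunks=1000):
    """Simplified open-ended recording loop: stop only on an empty read."""
    chunks = []
    for _ in range(max_chunks):  # safety cap so this demo always terminates
        buffer = read_chunk()
        if len(buffer) == 0:  # the only "no more audio input" signal
            break
        chunks.append(buffer)
    return chunks


def file_like_reader(total_chunks):
    """Hypothetical finite source, like an audio file: returns b"" at the end."""
    remaining = [total_chunks]

    def read_chunk():
        if remaining[0] == 0:
            return b""
        remaining[0] -= 1
        return b"\x01\x02" * 4

    return read_chunk


def muted_mic_reader():
    """Hypothetical live source: a muted mic still returns buffers of zeros."""
    return lambda: b"\x00\x00" * 4


from_file = record_until_empty(file_like_reader(3))
from_mic = record_until_empty(muted_mic_reader(), max_chunks=50)
print(len(from_file))  # 3: the loop stopped at the empty read
print(len(from_mic))   # 50: only the safety cap stopped it
```

If that model is right, r.record(source) without a duration can only finish on a finite source, so with a microphone either a duration or listen() is needed.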