I'm trying to build a voice assistant using the speech_recognition library. Below is my code:
from speech_recognition import Recognizer, Microphone

def takeCommand():
    print("taking command")
    r = Recognizer()
    m = Microphone()
    with m as source:
        try:
            print("speak...\n")
            r.adjust_for_ambient_noise(source, duration=0.5)
            audio = r.listen(source, timeout=5)
            command = r.recognize_google(audio).lower()
            print(command)
            # exporting the result
            with open('recorded_test.txt', mode='w') as file:
                file.write("Recognized text:\n")
                file.write(command)
            print("ready!")
            return command
        except Exception as e:
            print("command could not be parsed")
            print('Exception: ' + str(e))
            return None
Below is the conversation function I wrote so that my assistant can give appropriate answers to my queries:
def conversation():
    while True:
        question = takeCommand()
        if question is not None:
            print("entered if, command not empty")
            if "thank you" in question and "jarvis" in question:
                speak("shutting down, sir!")
                break
            elif "goodbye" in question and "jarvis" in question:
                speak("shutting down, sir!")
                break
            elif question:
                # some code to run the appropriate function to accomplish the task
                pass
        else:
            speak("I'm afraid I couldn't recognize what you said. Please say again!")
            continue
I'm calling takeCommand() inside my conversation() function. What I want to accomplish is this: as long as the user keeps giving commands, the flow of control in conversation() should follow the if-else branches as written.
But if the user deliberately stops giving commands, then instead of falling into the else branch, control should return to the main() function from which I called conversation().
So, basically, the code should work out on its own when the user is giving commands and when they have stopped.
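For what it's worth, speech_recognition already gives you a silence signal: when r.listen(source, timeout=5) hears no phrase start within the timeout, it raises sr.WaitTimeoutError, which a bare except Exception swallows together with ordinary recognition failures. If takeCommand() caught that case separately and returned a distinct sentinel, the loop could tell "user went silent" apart from "user spoke but wasn't understood", and the control flow could even be tested without a microphone. A minimal sketch of that idea follows; the SILENCE sentinel and the listen/speak/handle callables are hypothetical names of my own, not part of any library:

```python
SILENCE = object()  # hypothetical sentinel: nothing was spoken at all

def conversation_loop(listen, speak, handle):
    """Run until a shutdown phrase is heard or the user goes silent.

    listen -- returns command text, None (heard but not recognized),
              or SILENCE (e.g. takeCommand() caught sr.WaitTimeoutError
              and returned the sentinel instead of None)
    speak  -- text-to-speech callable
    handle -- dispatches a recognized command
    Returns a string telling main() why the loop ended.
    """
    while True:
        question = listen()
        if question is SILENCE:
            return "silent"  # hand control back to main()
        if question is None:
            speak("I'm afraid I couldn't recognize what you said. Please say again!")
            continue
        if "jarvis" in question and ("thank you" in question or "goodbye" in question):
            speak("shutting down, sir!")
            return "shutdown"
        handle(question)

# Driving the loop with canned events instead of a microphone:
events = iter(["play some music", None, SILENCE])
spoken, handled = [], []
reason = conversation_loop(lambda: next(events), spoken.append, handled.append)
print(reason)  # -> silent
```

With this shape, main() can inspect the returned reason and decide whether to exit or re-enter the loop later.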
Proposed Solution: I'm thinking of accomplishing this by running two functions in parallel. One function would continuously capture the audio input; meanwhile, the other would process it with a package like librosa or sounddevice to check the decibel level of that input. If the level is above a certain threshold, the recorded audio would be sent to the speech_recognition library for recognition; otherwise it would be discarded.
Can anyone help me accomplish the above task?
It doesn't need to follow my proposed solution; if there is a better way to accomplish this, please do tell. I'm also thinking of ML/AI approaches but don't know how to apply them here.