
I'm using the Google AIY Voice Kit (v2) with a Raspberry Pi Zero to build a voice-controlled robot. It's working great, but I have an elementary question. While the robot is processing user speech (and deciding how to respond), I want to play a short sound file to indicate that the robot is "thinking." The sound file currently plays too loud. How can I set the playback volume of a sound file in Python?

Here's a snippet of code:

aiy.voice.audio.play_wav_async("think.wav")

This plays successfully, but I can't figure out how to set the volume the way I can in the text-to-speech function: aiy.voice.tts.say(sentence, lang='en-GB', volume=10, pitch=120, speed=90, device='default')

Many thanks for any suggestions!

  • Please feel free to comment on my answer and let me know if I was not able to answer your question. If I was, then please let me know by accepting my answer -- although in this case, I would almost recommend against that at least for the time being as someone else may have a much simpler answer. – Cfomodz Jun 30 '21 at 18:47

1 Answer


This is a very hacky way to approach the problem, but after reading the AIY documentation and seeing that it simply reads the bytes from the file pointer, with no option to set volume (or anything else), I think a hack might be the best route.

Let's take your input file, modify it, then save it back as a tmp file to be read into AIY.

We can do something like the following:

# Put the HOW_MUCH_QUIETER constant near the top of your script so it is easy
# to adjust, much like the volume argument in
# aiy.voice.tts.say(sentence, lang='en-GB', volume=10, pitch=120, speed=90, device='default')
# (or drop the constant and write the number directly in the subtraction below).
import aiy.voice.audio
from pydub import AudioSegment

HOW_MUCH_QUIETER = 10  # reduction in dB

song = AudioSegment.from_wav("think.wav")

# reduce volume by HOW_MUCH_QUIETER dB
quieter_song = song - HOW_MUCH_QUIETER

# save the quieter version (not the original) as a WAV file
quieter_song.export("quieter.wav", format="wav")

aiy.voice.audio.play_wav_async("quieter.wav")
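
For a sense of what subtracting decibels does to the signal: a gain change in dB corresponds to an amplitude multiplier of 10^(dB/20). The small helper below is a made-up name for illustration only (it is not part of pydub or AIY); it just evaluates that standard formula for a few reductions.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier.

    Hypothetical helper for illustration; negative values mean quieter.
    """
    return 10 ** (gain_db / 20)


for db in (-6, -10, -20):
    print(f"{db} dB -> amplitude x {db_to_amplitude_ratio(db):.3f}")
# -6 dB -> amplitude x 0.501
# -10 dB -> amplitude x 0.316
# -20 dB -> amplitude x 0.100
```

So a 10 dB cut leaves roughly a third of the original amplitude; if that is still too loud, a larger number is worth trying.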
Cfomodz
  • Wow--this is a truly fiendish idea! Realistically, I only need 3 volume levels. I'll create 3 sound files, one for each volume level, and then play whichever one best suits the volume I'm using for the tts.say function. Thanks for this great suggestion. I will leave the question unanswered for a day or two in case there's something obvious we're missing. – Time Lord Jun 30 '21 at 23:23
  • Sounds good. I'm sure you can see where to go with that, but if not, you could just hard-code the 3 levels and save them as 3 different temp files. If you need any help with the code, just let me know. – Cfomodz Jul 01 '21 at 17:54
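
The three-level idea from these comments could be sketched roughly as follows. Note that this swaps in a different technique from the answer: instead of pydub it scales the samples directly with the standard library's `wave` and `array` modules. It assumes the source is an uncompressed 16-bit PCM WAV file, and the function names, level names, and dB values are all placeholders chosen for illustration.

```python
import array
import os
import sys
import wave


def write_scaled_copy(src_path, dst_path, gain_db):
    """Write a copy of a 16-bit PCM WAV file with its gain changed by gain_db."""
    factor = 10 ** (gain_db / 20)  # decibels -> linear amplitude multiplier
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = src.readframes(params.nframes)

    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Scale every sample and clamp to the signed 16-bit range.
    scaled = array.array(
        "h", (max(-32768, min(32767, round(s * factor))) for s in samples)
    )
    if sys.byteorder == "big":
        scaled.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(scaled.tobytes())


def render_levels(src_path, levels):
    """Create one scaled file per named level and return {name: path}."""
    base, ext = os.path.splitext(src_path)
    paths = {}
    for name, gain_db in levels.items():
        paths[name] = f"{base}_{name}{ext}"
        write_scaled_copy(src_path, paths[name], gain_db)
    return paths
```

With something like this, the files could be rendered once at startup, e.g. `paths = render_levels("think.wav", {"quiet": -20, "medium": -10, "loud": 0})`, and the robot would then call `aiy.voice.audio.play_wav_async(paths["quiet"])` or whichever level matches the current tts.say volume.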