I have some text-to-speech code that gives me a numpy array for its audio output. I can export this audio array to a WAV file like so:

import numpy as np
import scipy.io.wavfile

sample_rate = 48000

# Peak-normalize the TTS output (`audio` is the numpy array from the TTS code)
audio_normalized = audio / np.max(np.abs(audio))

# See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
scipy.io.wavfile.write(output_path, sample_rate, audio_normalized)

But when the text is long, I get this error:

  File "/Users/evar/code/python/blackbutler/blackbutler/butler.py", line 216, in cmd_zsh
    scipy.io.wavfile.write(output_path,
  File "/Users/evar/mambaforge/lib/python3.10/site-packages/scipy/io/wavfile.py", line 812, in write
    raise ValueError("Data exceeds wave file size limit")
ValueError: Data exceeds wave file size limit

So I think I need to convert the numpy array to a small, lossy format like m4a or mp3 using Python, and then save that.

HappyFace

1 Answer

Check out pydub, specifically AudioSegment() and its .export() method. It can save a numpy array as mp3. Related questions: "How to read a MP3 audio file into a numpy array / save a numpy array to MP3?" and "How to convert a numpy array to a mp3 file".
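
As a minimal sketch of the pydub route, assuming mono audio stored as a float numpy array: the array is scaled to 16-bit PCM and the raw bytes are handed to AudioSegment. The helper names float_to_int16 and save_mp3 and the 128k default bitrate are made up for illustration, and pydub needs ffmpeg (or libav) installed to encode mp3.

```python
import numpy as np


def float_to_int16(audio: np.ndarray) -> np.ndarray:
    """Peak-normalize a float array and scale it to 16-bit PCM."""
    audio = np.asarray(audio, dtype=np.float64)
    peak = np.max(np.abs(audio)) if audio.size else 0.0
    if peak > 0:
        audio = audio / peak
    # Scale to the int16 range; astype truncates toward zero.
    return (audio * 32767).astype(np.int16)


def save_mp3(audio: np.ndarray, sample_rate: int, output_path: str,
             bitrate: str = "128k") -> None:
    """Export a mono float array to MP3 (hypothetical helper)."""
    # Imported here so float_to_int16 stays usable without pydub installed.
    from pydub import AudioSegment

    pcm = float_to_int16(audio)
    segment = AudioSegment(
        pcm.tobytes(),
        frame_rate=sample_rate,
        sample_width=pcm.dtype.itemsize,  # 2 bytes per sample for int16
        channels=1,                       # assumption: mono audio
    )
    segment.export(output_path, format="mp3", bitrate=bitrate)
```

With the variables from the question, the call would be save_mp3(audio, sample_rate, output_path), with output_path ending in .mp3.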

If you would rather stay with .wav:

  • you can divide your output into fixed-length parts and store each part as its own .wav file
  • you can also resample the audio to a lower sample rate using librosa.resample(), or reduce the bit depth using soundfile.write().
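
The first option, combined with a lower bit depth, might look roughly like this. It is only a sketch: it assumes mono float audio, writes 16-bit PCM (a quarter of the size of float64 samples), and uses Python's standard-library wave module rather than scipy or soundfile to keep dependencies small. The names split_into_parts and write_wav_parts, the part_NNNN.wav file naming, and the 10-minute default part length are all hypothetical.

```python
import wave
from pathlib import Path

import numpy as np


def split_into_parts(audio: np.ndarray, sample_rate: int, part_seconds: float):
    """Yield consecutive slices of at most part_seconds each."""
    part_len = max(1, int(sample_rate * part_seconds))
    for start in range(0, len(audio), part_len):
        yield audio[start:start + part_len]


def write_wav_parts(audio, sample_rate, output_dir, part_seconds=600.0):
    """Write mono float audio as numbered 16-bit WAV files; return the paths."""
    audio = np.asarray(audio, dtype=np.float64)
    peak = np.max(np.abs(audio)) if audio.size else 0.0
    if peak > 0:
        audio = audio / peak  # normalize once so every part shares the same gain
    pcm = (audio * 32767).astype("<i2")  # little-endian int16, as WAV expects

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, part in enumerate(split_into_parts(pcm, sample_rate, part_seconds)):
        path = out_dir / f"part_{i:04d}.wav"
        with wave.open(str(path), "wb") as wav_file:
            wav_file.setnchannels(1)   # assumption: mono audio
            wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
            wav_file.setframerate(sample_rate)
            wav_file.writeframes(part.tobytes())
        paths.append(path)
    return paths
```

Normalizing before splitting matters: normalizing each part separately would give the parts different volumes.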
BEBROID