How do I export audio stored in a numpy array in a lossy format like m4a?

Question

I have some text-to-speech code that gives me a numpy array for its audio output. I can export this audio array to a WAV file like so:

import numpy as np
import scipy.io.wavfile

sample_rate = 48000

# Peak-normalize so the loudest sample has magnitude 1.0
audio_normalized = audio / np.max(np.abs(audio))

# scipy.io.wavfile.write (SciPy v1.10.0 Manual): https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
scipy.io.wavfile.write(output_path, sample_rate, audio_normalized)

But when the text is long, I get this error:

  File "/Users/evar/code/python/blackbutler/blackbutler/butler.py", line 216, in cmd_zsh
    scipy.io.wavfile.write(output_path,
  File "/Users/evar/mambaforge/lib/python3.10/site-packages/scipy/io/wavfile.py", line 812, in write
    raise ValueError("Data exceeds wave file size limit")
ValueError: Data exceeds wave file size limit

So I think I need to convert the numpy array to a small, lossy format like m4a or mp3 using Python, and then save that.

BEBROID · Answer 1 · 2023-02-27T11:12:17.520

Check out pydub, specifically pydub.AudioSegment() and its .export() method. It can encode audio held in a numpy array to mp3. Related questions: "How to read a MP3 audio file into a numpy array / save a numpy array to MP3?" and "How to convert a numpy array to a mp3 file".
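
A minimal sketch of the pydub route, assuming mono float audio and a working ffmpeg install (pydub shells out to it for encoding). The helper names float_to_int16 and export_lossy, and the 128k bitrate, are made up for illustration and are not part of pydub.

```python
import numpy as np


def float_to_int16(audio: np.ndarray) -> np.ndarray:
    """Peak-normalize float audio and convert it to 16-bit PCM samples."""
    audio = np.asarray(audio, dtype=np.float64)
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak
    # "<i2" is little-endian int16, the layout pydub expects for raw data
    return (audio * 32767).astype("<i2")


def export_lossy(audio, sample_rate, output_path, fmt="mp3", bitrate="128k"):
    """Encode a mono float array to a lossy file via pydub (needs ffmpeg)."""
    # Imported here so float_to_int16 stays usable without pydub installed
    from pydub import AudioSegment

    pcm = float_to_int16(audio)
    segment = AudioSegment(
        pcm.tobytes(),
        frame_rate=sample_rate,
        sample_width=pcm.dtype.itemsize,  # 2 bytes per sample
        channels=1,
    )
    segment.export(output_path, format=fmt, bitrate=bitrate)
```

Usage would look like export_lossy(audio, 48000, "speech.mp3"). For an .m4a file the format string is handed to ffmpeg as a muxer name, so it probably needs to be "ipod" or "mp4" rather than "m4a"; treat that as an assumption to check against your ffmpeg build.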

If you would rather stay with .wav:

you can divide your output into fixed-length parts and store each part as a separate .wav file
you can also resample the audio to a lower sample rate with librosa.resample(), or reduce the bit depth with soundfile.write() (for example subtype='PCM_16').
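
The first two ideas (fixed-length parts and a lower bit depth) can be sketched with scipy alone. This assumes a 1-D float array already scaled to [-1, 1]; the names iter_chunks and write_wav_parts and the 10-minute default are hypothetical. scipy.io.wavfile.write picks the bit depth from the array dtype, so writing int16 instead of float64 already cuts the file to a quarter of the size.

```python
import numpy as np


def iter_chunks(audio: np.ndarray, sample_rate: int, seconds: float):
    """Yield consecutive slices of at most `seconds` seconds each."""
    step = int(sample_rate * seconds)
    if step <= 0:
        raise ValueError("chunk length must cover at least one sample")
    for start in range(0, len(audio), step):
        yield audio[start:start + step]


def write_wav_parts(audio, sample_rate, stem, seconds=600.0):
    """Write 16-bit WAV parts named <stem>_000.wav, <stem>_001.wav, ..."""
    # Imported here so iter_chunks stays usable without scipy installed
    from scipy.io import wavfile

    paths = []
    for i, chunk in enumerate(iter_chunks(audio, sample_rate, seconds)):
        # Clip to [-1, 1], then scale to int16 (2 bytes per sample)
        pcm = (np.clip(chunk, -1.0, 1.0) * 32767).astype(np.int16)
        path = f"{stem}_{i:03d}.wav"
        wavfile.write(path, sample_rate, pcm)
        paths.append(path)
    return paths
```

Calling write_wav_parts(audio_normalized, 48000, "speech") would produce speech_000.wav, speech_001.wav, and so on, each holding at most ten minutes of audio.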
