How to convert a NumPy array to an MP3 file

Question

I am using the soundcard library to record my microphone input. It returns the recording as a NumPy array, and I want to save that audio as an MP3 file.

Code:

import soundcard as sc
import numpy
import threading


speakers = sc.all_speakers()  # Gets a list of the system's speakers
default_speaker = sc.default_speaker()  # Gets the default speaker
mics = sc.all_microphones()  # Gets a list of all the microphones


default_mic = sc.get_microphone('Headset Microphone (Arctis 7 Chat)')  # Gets the microphone with this name


# Records the microphone and plays it back through the default speaker
def record_mic():
    print('Recording...')
    with default_mic.recorder(samplerate=48000) as mic, default_speaker.player(samplerate=48000) as sp:
        for _ in range(1000000000000):
            data = mic.record(numframes=None)  # 'None' creates zero latency
            sp.play(data)

            # Save the mp3 file here


recordThread = threading.Thread(target=record_mic)
recordThread.start()

Akshay Sehgal · Accepted Answer · 2021-02-14T01:33:31.113

1

With Scipy (to wav file)

You can write the array to a WAV file with SciPy and then convert the WAV to MP3 in a separate step. More details here.

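import numpy as np
from scipy.io.wavfile import write

samplerate = 44100  # samples per second
fs = 100  # frequency of the test tone in Hz
t = np.linspace(0., 1., samplerate)  # one second of sample times

# Scale the sine wave to the full int16 range
amplitude = np.iinfo(np.int16).max
data = amplitude * np.sin(2. * np.pi * fs * t)

write("example.wav", samplerate, data.astype(np.int16))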
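
The second step mentioned above, converting the WAV file to MP3, is not shown in this answer. A minimal sketch of one way to do it with pydub follows. It assumes pydub is installed and that ffmpeg (or libav) is available on the PATH; the helper names `mp3_path_for` and `wav_to_mp3` and the default bitrate are illustrative choices, not part of the original answer.

```python
from pathlib import Path


def mp3_path_for(wav_path):
    """Return wav_path with its suffix replaced by .mp3 (hypothetical helper)."""
    return Path(wav_path).with_suffix(".mp3")


def wav_to_mp3(wav_path, bitrate="192k"):
    """Convert a WAV file to an MP3 file next to it and return the MP3 path.

    Assumes pydub is installed and ffmpeg (or libav) is on the PATH.
    """
    # Imported here so this module still loads when pydub is not installed
    from pydub import AudioSegment

    mp3_path = mp3_path_for(wav_path)
    sound = AudioSegment.from_wav(str(wav_path))
    sound.export(str(mp3_path), format="mp3", bitrate=bitrate)
    return mp3_path


# Example call (left commented out because it needs example.wav and ffmpeg):
# wav_to_mp3("example.wav")
```

Called on the "example.wav" written by the SciPy snippet, this should produce "example.mp3" in the same folder.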

With pydub (to mp3)

Try this function, taken from this excellent thread:

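import pydub
import numpy as np

def write(f, sr, x, normalized=False):
    """Write a NumPy array to an MP3 file.

    f: output path, sr: sample rate in Hz, x: samples shaped (frames,) or (frames, 2).
    """
    channels = 2 if (x.ndim == 2 and x.shape[1] == 2) else 1
    if normalized:  # normalized array - each item should be a float in [-1, 1)
        y = np.int16(x * 2 ** 15)
    else:
        y = np.int16(x)
    # sample_width=2 means the raw bytes passed in are 16-bit samples
    song = pydub.AudioSegment(y.tobytes(), frame_rate=sr, sample_width=2, channels=channels)
    song.export(f, format="mp3", bitrate="320k")

# x is the array of samples, for example a stereo array like this:
#[[-225  707]
# [-234  782]
# [-205  755]
# ..., 
# [ 303   89]
# [ 337   69]
# [ 274   89]]

# sr is the sample rate of x in Hz and has to be supplied by the caller
write('out2.mp3', sr, x)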

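Note: sample_width=2 tells pydub that the input samples are 16-bit. As suggested by @Arty, you can set sample_width=3 for 24-bit input (the np.int16 conversion in the function would then need to change as well). The output MP3 will of course be 16-bit, because MP3s are always 16-bit.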
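
For the recording loop in the question, the array passed as x has to come from the recorder. The sketch below shows that preparation step: collect the recorded chunks, join them, and scale them to 16-bit integers. It assumes, as I understand the soundcard library, that mic.record returns float arrays of shape (frames, channels) with values in roughly [-1, 1]; the helper names `float_to_int16` and `join_chunks` are hypothetical, and the chunks here are stand-in data rather than a real recording.

```python
import numpy as np


def float_to_int16(x):
    """Scale float samples in [-1, 1] to int16, clipping anything outside that range."""
    clipped = np.clip(np.asarray(x, dtype=np.float64), -1.0, 1.0)
    return np.round(clipped * np.iinfo(np.int16).max).astype(np.int16)


def join_chunks(chunks):
    """Concatenate recorded chunks of shape (frames, channels) along the frame axis."""
    return np.concatenate(chunks, axis=0)


# Stand-in for the chunks returned by mic.record(...): two short stereo blocks
chunks = [
    np.array([[0.0, 1.0], [0.5, -1.0]], dtype=np.float32),
    np.array([[2.0, -0.25]], dtype=np.float32),  # 2.0 is out of range and gets clipped
]
samples = float_to_int16(join_chunks(chunks))
print(samples.shape, samples.dtype)
```

The resulting array could then be passed as x with normalized left at False, using sr=48000 to match the samplerate in the question, for example write('out.mp3', 48000, samples).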

edited Feb 14 '21 at 01:33

answered Feb 14 '21 at 01:20

I think `sample_width=2` specifies that the data is 16-bit. To support 24-bit you probably just have to change it to `sample_width=3`. – Arty Feb 14 '21 at 01:26
24-bit is common for WAV, but I am not sure whether MP3s can be 24-bit. As far as I know they are 16-bit, but I may be wrong. – Akshay Sehgal Feb 14 '21 at 01:28
Yes, MP3s are 16-bit, but I thought that by the phrase `Note: It only works for 16-bit files` you meant that your code doesn't support 24-bit INPUT data, because the line of code with `sample_width=2` relates to the input samples, not to the output MP3. So to support 24-bit input samples just use `sample_width=3`, and the output MP3 will of course be 16-bit, because MP3s are always 16-bit. – Arty Feb 14 '21 at 01:31
Ah OK, my bad. I'll update it to be less confusing. – Akshay Sehgal Feb 14 '21 at 01:32
When I run the code I get the error "Exception has occurred: NameError name 'sr' is not defined" – ahmedquran12 Feb 14 '21 at 02:11
You will have to pass the frame_rate yourself. That's the sr. – Akshay Sehgal Feb 14 '21 at 02:20
As per the official documentation: `Also known as sample rate, CD Audio has a 44.1kHz sample rate, which means frame_rate will be 44100. Common values are 44100 (CD), 48000 (DVD), 22050, 24000, 12000 and 11025` – Akshay Sehgal Feb 14 '21 at 02:22
Try `sr = 11025`, or `sr = 44100` if you are not sure. – Akshay Sehgal Feb 14 '21 at 02:22
What will 'x' be? – ahmedquran12 Feb 14 '21 at 03:32
When I try to export it, I get an error: 'Exception has occurred: FileNotFoundError [WinError 2] The system cannot find the file specified'. f is equal to 'output/file.mp3' – ahmedquran12 Feb 14 '21 at 04:54
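
A FileNotFoundError at export time has at least two plausible causes, neither confirmed in this thread: pydub relies on an external ffmpeg (or libav) executable for MP3 encoding, which may not be installed or on the PATH, or the output folder may not exist yet. A small sketch that checks both before exporting, using only the standard library; the function name `check_export_prerequisites` is hypothetical.

```python
import os
import shutil
import tempfile


def check_export_prerequisites(out_path):
    """Create the output folder if needed and report whether an encoder is on the PATH."""
    folder = os.path.dirname(out_path)
    if folder:
        os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    # pydub calls an external program for MP3 export; look for the usual executables
    encoder = shutil.which("ffmpeg") or shutil.which("avconv")
    return encoder is not None


# Demo in a temporary folder so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output", "file.mp3")
    encoder_found = check_export_prerequisites(target)
    folder_created = os.path.isdir(os.path.dirname(target))

print("output folder created:", folder_created)
print("encoder found on PATH:", encoder_found)
```

If no encoder is found, installing ffmpeg and adding it to the PATH is the usual remedy before calling the export again.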

score 0 · Answer 2 · answered Dec 23 '22 at 15:33

As of now, the accepted answer produces extremely distorted sound, at least in my case, so here is an improved version:

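import librosa
import numpy as np
from pydub import AudioSegment

# dir and file are the folder and file name of the audio to read

# librosa read
y, sr = librosa.load(dir + file, sr=None)  # sr=None keeps the file's native sample rate
y = librosa.util.normalize(y)

# pydub read
sound = AudioSegment.from_file(dir + file)
channel_sounds = sound.split_to_mono()
samples = [s.get_array_of_samples() for s in channel_sounds]
fp_arr = np.array(samples).T.astype(np.float32)
fp_arr /= np.iinfo(samples[0].typecode).max  # scale integer samples to [-1, 1]

# keep only the first channel
fp_arr = np.array([x[0] for x in fp_arr])
# I normalize the pydub waveform with librosa for comparison purposes
fp_arr = librosa.util.normalize(fp_arr)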

So you can read the audio file with any library to get a waveform, and then export it to any pydub-supported codec with the code below. I also tried it with the waveform read by librosa, and it works perfectly.

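import io

import scipy.io.wavfile
from pydub import AudioSegment

# sample_rate and waveform come from the reading step above (e.g. sr and y)
# Write the waveform to an in-memory WAV file and load it into pydub
wav_io = io.BytesIO()
scipy.io.wavfile.write(wav_io, sample_rate, waveform)
wav_io.seek(0)
sound = AudioSegment.from_wav(wav_io)

with open("file_exported_by_pydub.mp3", 'wb') as af:
    sound.export(
        af,
        format='mp3',
        codec='mp3',
        bitrate='160000',
    )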
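
The in-memory WAV step can be checked on its own, without touching the disk or ffmpeg. The sketch below swaps in the standard-library wave module in place of scipy.io.wavfile.write (a substitution, not what this answer uses) and assumes int16 samples shaped (frames,) or (frames, channels); `int16_to_wav_buffer` is a hypothetical name and the tone is stand-in data. Reading the buffer back shows the channel count, sample width, sample rate, and frame count stored in the header.

```python
import io
import wave

import numpy as np


def int16_to_wav_buffer(samples, sample_rate):
    """Write int16 samples of shape (frames,) or (frames, channels) to an in-memory WAV."""
    # WAV data is little-endian; "<i2" forces that byte order on any platform
    samples = np.ascontiguousarray(samples, dtype="<i2")
    channels = 1 if samples.ndim == 1 else samples.shape[1]
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(samples.tobytes())  # row-major order interleaves channels
    buffer.seek(0)  # rewind so a reader starts at the header
    return buffer


# One second of a 440 Hz stereo tone as stand-in data
rate = 48000
t = np.arange(rate) / rate
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
stereo = np.column_stack([tone, tone])

wav_io = int16_to_wav_buffer(stereo, rate)
with wave.open(wav_io, "rb") as check:
    info = (check.getnchannels(), check.getsampwidth(),
            check.getframerate(), check.getnframes())
print(info)
```

After a check like this, wav_io.seek(0) is needed again before passing the buffer to AudioSegment.from_wav.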
