Python Librosa with Microphone input

Question

I am trying to get librosa to work with microphone input instead of just a wav file, and I have been running into a few problems. I use the pyaudio library to connect to the microphone, but I am having trouble translating this data into something librosa can use. Any suggestions on how this should be approached, or is it even possible?

One thing I tried was to read data from the pyaudio mic stream, decode it into an array of floats and pass it to librosa (according to the docs, this is what librosa does with wav files in .load). It doesn't work and produces the following error: "librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere"


FORMAT = pyaudio.paInt16
RATE = 44100
CHUNK = 2048
WIDTH = 2
CHANNELS = 2
RECORD_SECONDS = 5

stream = audio.open(format=FORMAT,
                    channels = CHANNELS,
                    rate = RATE,
                    input=True,
                    output=True,
                    frames_per_buffer=CHUNK)
while True:
        data = stream.read(CHUNK)
        data_float = np.fromstring(data , dtype=np.float16)
        data_np = np.array(data_float , dtype='d')
        # data in 1D array
        mfcc = librosa.feature.mfcc(data_np.flatten() , 44100)
        print(mfcc)

I don't think it is as simple as you make it out to be. You are trying to record and process audio in real time! — Ahmad Moussa, Nov 30 '19 at 20:55
Hey @AhmadMoussa, yeah, it definitely isn't as simple as I first thought. I was following this on YouTube [ https://www.youtube.com/watch?v=AShHJdSIxkY ] to generate a real-time sine wave from microphone input using pyaudio, and I was wondering if I can do something similar with librosa to gather information such as the MFCC in real time. I don't know if this is achievable, or if there is another way. Thanks again! — Vince, Dec 01 '19 at 22:25

PasNinii · Accepted Answer · 2020-06-17T19:31:01.693

You can do it using a callback function from pyaudio. I think it's easier with a class.

In the constructor __init__ you define all the constants you need and set FORMAT to pyaudio.paFloat32, which lets you use the data with librosa later.

Then in the start method I open the audio stream. The stream_callback parameter of .open() lets you pass the function that pyaudio calls whenever a new chunk of audio is available.

The callback method takes in_data, frame_count, time_info and flag as arguments. in_data arrives as raw bytes, so you need np.frombuffer(in_data, dtype=np.float32) to convert it into a numpy array.
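If the stream is opened with pyaudio.paInt16 instead, as in the question, the bytes have to be read as int16 and then scaled; reading them as np.float16 only reinterprets the bit patterns, and some of those patterns are inf or NaN, which is the likely source of the "not finite everywhere" error. A minimal sketch of that conversion, where the helper name int16_bytes_to_float32 and the simulated chunk are assumptions made for illustration and are not part of pyaudio or librosa:

```python
import numpy as np

def int16_bytes_to_float32(raw, channels=1):
    """Convert interleaved 16-bit PCM bytes to mono float32 samples in [-1.0, 1.0)."""
    samples = np.frombuffer(raw, dtype=np.int16)
    # Dividing by 32768 maps the int16 range onto [-1.0, 1.0)
    audio = samples.astype(np.float32) / 32768.0
    if channels > 1:
        # Samples are interleaved (L, R, L, R, ...): average the channels per frame
        audio = audio.reshape(-1, channels).mean(axis=1)
    return audio

# Simulated stereo chunk standing in for stream.read(CHUNK)
raw = np.array([0, 0, 16384, -16384, 32767, 32767, -32768, -32768],
               dtype=np.int16).tobytes()
mono = int16_bytes_to_float32(raw, channels=2)

# The same kind of bytes read as float16: the int16 value 31744 (0x7C00) is +inf
bad = np.frombuffer(np.array([31744], dtype=np.int16).tobytes(), dtype=np.float16)
```

The resulting mono array is finite everywhere and can be passed to librosa as y together with the stream's sample rate.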

Once this is done, you can use the numpy.ndarray as you normally would with librosa.

I think this can be optimized, but this solution works fine for me. Hope it helps :)
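One possible optimization, sketched under assumptions: keep the callback cheap by only converting and enqueueing each chunk, and do the feature extraction in the main loop. To stay runnable without a microphone, the sketch below simulates the callback invocations, uses PA_CONTINUE as a stand-in for pyaudio.paContinue, and uses a hypothetical extract_features placeholder (an RMS level) where the real code would call librosa.feature.mfcc.

```python
import queue
import numpy as np

PA_CONTINUE = 0          # stand-in for pyaudio.paContinue so this runs without PyAudio
chunks = queue.Queue()   # thread-safe handoff between the callback and the main loop

def callback(in_data, frame_count, time_info, flag):
    # Only convert and enqueue here; heavy work is left to the consumer
    chunks.put(np.frombuffer(in_data, dtype=np.float32))
    return None, PA_CONTINUE

def extract_features(y):
    # Placeholder for librosa.feature.mfcc(y=y, sr=RATE): root-mean-square level
    return float(np.sqrt(np.mean(y ** 2)))

def drain(q):
    """Process every chunk currently waiting in the queue."""
    results = []
    while True:
        try:
            y = q.get_nowait()
        except queue.Empty:
            return results
        results.append(extract_features(y))

# Simulate three callback invocations with constant-level chunks
for level in (0.0, 0.5, 1.0):
    fake_chunk = np.full(2048, level, dtype=np.float32).tobytes()
    callback(fake_chunk, 2048, None, 0)

levels = drain(chunks)
```

In the class below, drain would be called from mainloop in place of the plain sleep, so that a slow MFCC computation does not delay the audio thread.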

import numpy as np
import pyaudio
import time
import librosa

class AudioHandler(object):
    def __init__(self):
        self.FORMAT = pyaudio.paFloat32
        self.CHANNELS = 1
        self.RATE = 44100
        self.CHUNK = 1024 * 2
        self.p = None
        self.stream = None

    def start(self):
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=self.FORMAT,
                                  channels=self.CHANNELS,
                                  rate=self.RATE,
                                  input=True,
                                  output=False,
                                  stream_callback=self.callback,
                                  frames_per_buffer=self.CHUNK)

    def stop(self):
        # Stop the stream before closing it, then release PortAudio resources
        self.stream.stop_stream()
        self.stream.close()
        self.p.terminate()

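    def callback(self, in_data, frame_count, time_info, flag):
        # in_data holds raw bytes; reinterpret them as float32 samples
        numpy_array = np.frombuffer(in_data, dtype=np.float32)
        # Pass audio and sample rate by keyword; sr defaults to 22050 otherwise
        mfcc = librosa.feature.mfcc(y=numpy_array, sr=self.RATE)
        # Input-only stream, so there is no output data to return
        return None, pyaudio.paContinue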

    def mainloop(self):
        # Keep the program alive while the callback runs in the background.
        # Add your own stop condition here (e.g. a button press that calls stop()).
        while self.stream.is_active():
            time.sleep(2.0)


audio = AudioHandler()
audio.start()     # open the stream
audio.mainloop()  # wait while the callback processes audio with librosa
audio.stop()

Thank you for the answer! I will try this out. The way I worked around it was to record for a certain amount of time (once the audio passes a set amplitude), save it as a wav file and then use librosa, although my workaround is less desirable. :) — Vince, Jun 21 '20 at 21:16
