I have a large audio file streaming from a web service.

I would like to load the audio data into librosa for batched stream analysis.

I took a look at librosa.core.stream, where the description mentions:

Any codec supported by soundfile is permitted here.

But I can't seem to figure out how I can feed the binary batch data from requests:

import requests
import numpy as np

audio_url = "http://localhost/media/audioplayback.m4a"

response = requests.get(
    audio_url,
    stream=True,
)

for chunk in response.iter_content(chunk_size=4096):
    npChunk = np.frombuffer(chunk, dtype=np.float64)
    # Load chunk data into librosa

I know I need to convert the audio format, but I'm not sure what the recommended way is. I know it is possible to load the data directly into a NumPy array instead of calling librosa.stream, but I can't figure out which combination of soundfile, audioread, or GStreamer does the format conversion.

I am using python==3.6.5 inside a conda environment on Windows Subsystem for Linux.

Any help would be greatly appreciated! Thank you!

Rohit Mistry
  • I am testing with youtube-dl audio URL as an [example](https://stackoverflow.com/a/50881927/9168936) – Rohit Mistry Aug 10 '19 at 23:13
  • **Update**: made some progress using ffmpeg using [this approach](http://zulko.github.io/blog/2013/10/04/read-and-write-audio-files-in-python-using-ffmpeg/) but the amplitude data seems corrupted when looking at line-plot or playing back in `IPython.display.Audio(...)` – Rohit Mistry Aug 11 '19 at 01:12
  • Can you provide an example URL for the kind of audio stream you are working with? – Jon Nordby Aug 16 '19 at 10:08
  • This comment has some hints on how to do this with GStreamer: https://stackoverflow.com/questions/3507746/use-python-gstreamer-to-decode-audio-to-pcm-data – Jon Nordby Aug 16 '19 at 10:28
  • audiofile does not appear to accept anything but a file path. soundfile supports file-like objects, but expects the entire file to be available (no streaming). – Jon Nordby Aug 16 '19 at 11:11
  • An example audio URL from YouTube is too long for an SO comment. I get the URL with the command `youtube-dl --skip-download --extract-audio --format bestaudio --get-url https://www.youtube.com/watch?v=Sv7y4rbm-9Q` – Rohit Mistry Aug 17 '19 at 18:24
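
For the ffmpeg route mentioned in the comments, here is a minimal sketch of decoding a URL to raw PCM through a pipe and converting it block by block. It is not a definitive implementation: the function names `pcm16_to_float` and `stream_pcm_blocks` are made up for illustration, and it assumes `ffmpeg` is on the PATH and can open the URL directly. One possible cause of corrupted-looking amplitudes (not confirmed for this case) is interpreting the bytes with the wrong dtype, or splitting a read in the middle of a sample, so the sketch reads whole frames of signed 16-bit data and scales them explicitly.

```python
import subprocess

import numpy as np


def pcm16_to_float(raw, channels=1):
    """Convert interleaved signed 16-bit little-endian PCM bytes to float32 in [-1, 1)."""
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
    return samples.reshape(-1, channels)


def stream_pcm_blocks(url, sr=22050, channels=1, block_frames=4096):
    """Yield float32 blocks of shape (frames, channels) decoded by an ffmpeg subprocess.

    Assumption: ffmpeg is installed and can read `url` itself.
    """
    cmd = [
        "ffmpeg", "-loglevel", "error", "-i", url,
        "-f", "s16le", "-acodec", "pcm_s16le",   # raw 16-bit PCM on stdout
        "-ar", str(sr), "-ac", str(channels), "-",
    ]
    frame_bytes = 2 * channels                    # bytes per frame of int16 samples
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        while True:
            raw = proc.stdout.read(block_frames * frame_bytes)
            if not raw:
                break
            # Drop a trailing partial frame (only possible at end of stream).
            raw = raw[: len(raw) - len(raw) % frame_bytes]
            if raw:
                yield pcm16_to_float(raw, channels)
    finally:
        proc.stdout.close()
        if proc.poll() is None:
            proc.kill()
        proc.wait()
```

Each yielded block is already decoded audio, so it can go straight into NumPy or librosa analysis code; the stream length never needs to be known up front.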

2 Answers

Use this:

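import asyncio
import io

import IPython.display
import librosa
import numpy as np
from aiohttp import ClientSession, ClientTimeout


class PredictionAudioURLLoader:
    async def __call__(self, Radio_STREAM_URL, timeout):
        """
        Returns byte data from an audio stream URL.
        Args:
            Radio_STREAM_URL (str): The URL for the audio stream.
            timeout (int): The maximum number of seconds to wait before timing out.

        Returns:
            bytes: The byte data from the audio stream.
        """
        # Apply the timeout to the whole request (connect + read).
        client_timeout = ClientTimeout(total=timeout)
        try:
            async with ClientSession(timeout=client_timeout) as session:
                async with session.get(Radio_STREAM_URL) as resp:
                    return await resp.content.read()
        except asyncio.TimeoutError:
            print(f"Timeout reached after {timeout} seconds.")
            raise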


class Streamer:
    """
    Stream class for listening to a radio station stream session for 20 seconds.
    """
    def __init__(self):
        self.STREAM_PATH = "/"
        self.BUFFER_SIZE = 4096
    

    async def runstream(self, Radio_STREAM_URL: str) -> np.ndarray:
        """
        Reads the audio stream and returns the audio data as a NumPy array.
        
        Args:
            Radio_STREAM_URL (str): The URL for the audio stream.
            
        Returns:
            np.ndarray: The audio data as a NumPy array.
        """
        try:
            loader = PredictionAudioURLLoader()
            response_futures = await loader(Radio_STREAM_URL, 600)
            audio_data, sr = librosa.load(io.BytesIO(response_futures))
            return audio_data
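        except Exception as e:
            # Raise instead of returning a string where an array is expected.
            raise RuntimeError(f"Could not load audio from {Radio_STREAM_URL}") from e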

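# Top-level `await` works in Jupyter/IPython; elsewhere, call this from a coroutine.
streamer = Streamer()
y = await streamer.runstream(station_stream_url)  # station_stream_url: your stream URL
# Play the audio using IPython (22050 Hz is librosa.load's default sample rate)
IPython.display.Audio(y, rate=22050)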
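
A live radio stream never ends, so reading the whole response body cannot finish on its own. A rough sketch of capturing only a fixed window (for example the 20 seconds mentioned in the docstring) is below. `collect_for` is a hypothetical helper, not part of aiohttp; it takes any async iterator of byte chunks, such as `resp.content.iter_chunked(4096)`. It checks the clock between chunks, so it assumes the source keeps delivering data and does not bound a stalled connection by itself.

```python
import asyncio
import time


async def collect_for(chunks, seconds, max_bytes=None):
    """Gather byte chunks until `seconds` elapse, `max_bytes` is reached, or the source ends."""
    deadline = time.monotonic() + seconds
    buf = bytearray()
    async for chunk in chunks:
        buf.extend(chunk)
        if time.monotonic() >= deadline:
            break
        if max_bytes is not None and len(buf) >= max_bytes:
            break
    return bytes(buf)


# Possible use with aiohttp (assumed wiring, shown as comments only):
#
#     async with ClientSession() as session:
#         async with session.get(url) as resp:
#             data = await collect_for(resp.content.iter_chunked(4096), seconds=20)
#     y, sr = librosa.load(io.BytesIO(data))
```

Cutting a compressed stream at an arbitrary byte may leave a partial frame at the end; whether the decoder tolerates that depends on the codec and container.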

My current solution is this:

You need to install pydub

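import io

import librosa
from flask import request  # assumption: the snippet reads like the body of a Flask view
from pydub import AudioSegment


def load_uploaded_audio():  # hypothetical wrapper so that `return` is valid
    # Read the upload in one call instead of appending byte by byte.
    audio_bytes = request.files['audio_data'].stream.read()
    s = io.BytesIO(audio_bytes)
    audioObj = AudioSegment.from_file(s)
    audioObj = audioObj.set_sample_width(2).set_frame_rate(16000).set_channels(1)
    # export() writes MP3 unless told otherwise, so request WAV explicitly.
    audioObj.export("input_audio.wav", format="wav")
    # sr=16000 keeps the rate set above; the default would resample to 22050 Hz.
    wav, sr = librosa.load("input_audio.wav", sr=16000)
    return wav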

My frontend code is this:

recorder.onComplete = function(recorder, blob) {
    console.log("Encoding complete");
    createDownloadLink(blob, recorder.encoding);

    // START AJAX HERE
    var fd = new FormData();
    fd.append('audio_data', blob);
    console.log('Transcribing...');
    document.getElementById('res_stat').innerHTML = "Waiting For Server's Response";

    $.ajax({
        type: 'POST',
        url: "/audio",
        data: fd,
        processData: false,
        contentType: false,
        dataType: "json",
        success: function(text) {
            console.log("Output Received");

            document.getElementById("predOut").value = text.text;
        }
    });

    console.log("Waiting For Server's Response");
};
thethiny
  • Thanks @thethiny, but this doesn't fit my needs. I'd have to use pydub to load the entire track into memory, save it to disk, and then load it from disk into librosa. That is a trivial workaround, but I'd like to work with a stream instead, where the length of the stream may be unknown at run time. – Rohit Mistry Sep 14 '19 at 16:25
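
For a stream of unknown length, one option is to buffer decoded samples and emit fixed-size, overlapping analysis frames as soon as enough data has arrived. The sketch below illustrates that idea only; `iter_frames` is a made-up name, and it assumes the samples have already been decoded to mono PCM arrays by some other step (ffmpeg, GStreamer, or similar).

```python
import numpy as np


def iter_frames(sample_chunks, frame_length=2048, hop_length=512):
    """Yield overlapping frames from an iterable of 1-D sample arrays of arbitrary size."""
    buf = np.zeros(0, dtype=np.float32)
    for chunk in sample_chunks:
        buf = np.concatenate([buf, np.asarray(chunk, dtype=np.float32)])
        # Emit every complete frame currently in the buffer, keeping the overlap.
        while len(buf) >= frame_length:
            yield buf[:frame_length].copy()
            buf = buf[hop_length:]
    # Samples left over at the end (fewer than frame_length) are discarded here.
```

Memory use stays bounded by roughly one frame plus one incoming chunk, regardless of how long the stream runs. Each frame can then be analyzed independently, for example with `np.fft.rfft(frame)`.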