I want to record audio in realtime on Ubuntu and pyalsaaudio seems to work best for detecting my input devices correctly. I started off with the included recordtest.py script, and wanted to experiment with latency to see when the buffer would fill up and give me an error (or at least return -EPIPE) - as per the pyalsaaudio documentation for PCM.read():

In case of an overrun, this function will return a negative size: -EPIPE. This indicates that data was lost, even if the operation itself succeeded. Try using a larger periodsize.

However, a tiny buffer size wasn't causing problems, so to further investigate I added in huge time.sleep()'s in between calls to read() in recordtest.py:

inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK, 
    channels=1, rate=44100, format=alsaaudio.PCM_FORMAT_S16_LE, 
    periodsize=160, device=device)

loops_with_data = 3000 #3000*160/44100 = 10.9 seconds of audio
first_time = True
while loops_with_data > 0:
    # Read data from device
    l, data = inp.read()
    print("l:",l)

    if l:
        f.write(data)
        if first_time:
            #big delay after first data read
            time.sleep(100)
            first_time = False
        else:
            #smaller delay otherwise, still longer than one period length
            time.sleep(.01)
        loops_with_data-=1

I would've expected this to overrun the buffer. However, the value of l returned by read() is never negative, and is almost always 160. When I play back the audio, I get a perfect recording of the first 10.9 seconds of what I said into the microphone. Somehow it seems that the buffer being used is huge, storing over 100 seconds' worth of audio, so that when read() is called 100 seconds later it can still access all the old periods of frames. The problem is that if my application runs a function between calls to read() that takes too long, the audio will keep getting more and more delayed and I'll be none the wiser, since nothing indicates that this is happening.

I've tried digging into alsaaudio.c and have discovered some weirdness: no matter what I do, the PCM object always seems to think it has a buffer size of a reasonable number of frames (assuming frames = audio samples), but buffer time and number of periods per buffer always show up as 0. I've tried printing this using inp.info() in Python, and printing it in the C file itself. It's extra weird because the C file is clearly trying to set 4 periods per buffer using snd_pcm_hw_params_set_periods_near():

dir = 0;
unsigned int periods = 4;
snd_pcm_hw_params_set_periods_near(self->handle, hwparams, &periods, &dir);

But after the following line, periods gets set to 0:

/* Query current settings. These may differ from the requested values, 
which should therefore be synced with actual values */

snd_pcm_hw_params_current(self->handle, hwparams);

I've tried all sorts of other functions (like snd_pcm_hw_params_set_periods_min() and snd_pcm_hw_params_set_periods_max()) with no luck.

Murph
  • Have you solved the problem or is it still worth looking into? – Ronald van Elburg Oct 14 '22 at 21:45
  • I never did solve it, would still love help! – Murph Oct 16 '22 at 18:35
  • About periodsize there is an open documentation issue: https://github.com/larsimmisch/pyalsaaudio/issues/110 That is not a solution but just a bit of relevant background info. – Ronald van Elburg Oct 16 '22 at 21:36
  • Threading or multiprocessing might offer a solution for your problem. But I need to think a bit more about it, and get back into alsa. Questions: 1. do you care about losing data? 2. Can you say a bit more about your usecase? As that drives the balance required between real time behavior and keeping your data continuous. – Ronald van Elburg Oct 16 '22 at 21:45

1 Answer

The function snd_pcm_drop allows you to drop the contents of the buffer. This function is already available from pyalsaaudio as the drop method for a PCM device.

After:

#big delay after first data read
            time.sleep(100)

you can simply add

            inp.drop()

All input that arrived before calling drop() will be ignored. (But there is still some sound from the start of the script in the script's own data variable.)
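To decide when calling drop() is worthwhile, rather than waiting for an -EPIPE that never arrives, one option is to compare the amount of audio consumed so far with the wall-clock time since the first read. The following is only a sketch: LagMonitor, its max_lag threshold and the injectable clock are hypothetical names and not part of pyalsaaudio, and it assumes the device delivers frames at the nominal rate.

```python
import time


class LagMonitor:
    """Hypothetical helper: estimates how far consumed audio trails real time."""

    def __init__(self, rate, max_lag=0.5, clock=time.monotonic):
        self.rate = rate        # frames per second
        self.max_lag = max_lag  # tolerated backlog in seconds (assumed value)
        self.clock = clock      # injectable so the helper can be tested
        self.reset()

    def reset(self):
        """Forget history, e.g. right after the device buffer was dropped."""
        self.start = None
        self.frames = 0

    def update(self, nframes):
        """Record nframes just read and return the estimated backlog in seconds."""
        now = self.clock()
        if self.start is None:
            self.start = now
        # Negative return values from read() signal errors, not frames.
        self.frames += max(nframes, 0)
        return (now - self.start) - self.frames / self.rate

    def behind(self, nframes):
        """True when the estimated backlog exceeds the tolerated lag."""
        return self.update(nframes) > self.max_lag
```

In the read loop this could be used as: if monitor.behind(l), call inp.drop() and then monitor.reset(). The estimate is rough, because it ignores clock drift between the sound card and the system clock.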

More subtle solutions seem possible, but would require adding snd_pcm_forward and perhaps snd_pcm_forwardable to the pyalsaaudio interface.

Here is the complete modified script I used for analysis and testing. (I shortened the big delay to 4 seconds.) I also used soundfile to create the wav file, as Audacity wasn't happy with the other method of creating wav files.

import time
import alsaaudio
import numpy as np
import struct
import soundfile as sf

conversion_dicts = {
        alsaaudio.PCM_FORMAT_S16_LE: {'dtype': np.int16, 'endianness': '<', 'formatchar': 'h', 'bytewidth': 2},
}

def get_conversion_string(audioformat, noofsamples):
    conversion_dict = conversion_dicts[audioformat]
    conversion_string = f"{conversion_dict['endianness']}{noofsamples}{conversion_dict['formatchar']}"
    return conversion_string

device = 'default'
fs = 44100

inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK, 
    channels=1, rate=fs, format=alsaaudio.PCM_FORMAT_S16_LE, 
    periodsize=160, device=device)

print(inp.info())

f = sf.SoundFile("test.wav", 'wb', samplerate=fs, channels=1)

dtype = np.int16 

loops_with_data = 3000 #3000*160/44100 = 10.9 seconds of audio
first_time = True

while loops_with_data > 0:
    # Read data from device
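    l, rawdata = inp.read()

    if l > 0:
        # Convert only when frames were returned: l is 0 when no data is
        # ready and negative on an error, which would break struct.unpack
        conversion_string = get_conversion_string(alsaaudio.PCM_FORMAT_S16_LE, l)
        data = np.array(struct.unpack(conversion_string, rawdata), dtype=dtype)
        print(f"\r{loops_with_data:4}", end='')
        f.write(data)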
        if first_time:
            #big delay after first data read
            time.sleep(4)
            inp.drop()
            first_time = False
        else:
            #smaller delay otherwise, still longer than one period length
            time.sleep(.01)
        loops_with_data-=1
    else:
        print(".", end='')
        
f.close()
  • Using snd_pcm_forward I think it would be possible to keep say the last second from before the moment you decide you want to get the sound from the buffer. – Ronald van Elburg Oct 17 '22 at 14:52
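Until snd_pcm_forward is exposed by pyalsaaudio, a userspace approximation of "keep only the last second" might be a reader thread that pushes periods into a bounded deque, in line with the threading suggestion in the comments on the question. This is a sketch only: RecentAudio is a hypothetical helper, it assumes pcm.read() returns a (length, data) tuple as pyalsaaudio's PCM.read() does, and it assumes the PCM was opened in blocking mode so that the thread does not spin.

```python
import collections
import threading


class RecentAudio:
    """Hypothetical helper: keeps only the most recent periods read from a PCM."""

    def __init__(self, pcm, rate, periodsize, seconds=1.0):
        self.pcm = pcm
        # Number of whole periods that cover roughly `seconds` of audio.
        maxlen = max(1, int(seconds * rate / periodsize))
        self.periods = collections.deque(maxlen=maxlen)
        self.lock = threading.Lock()
        self.stop = threading.Event()

    def poll(self):
        """Read one period; the oldest period falls off the deque when full."""
        length, data = self.pcm.read()
        if length > 0:
            with self.lock:
                self.periods.append(data)
        return length

    def run(self):
        """Thread target: keep draining the device until stop is set."""
        while not self.stop.is_set():
            self.poll()

    def snapshot(self):
        """Return the buffered audio as a single bytes object."""
        with self.lock:
            return b"".join(self.periods)
```

A possible usage is threading.Thread(target=recent.run, daemon=True).start(), after which the main loop calls recent.snapshot() whenever it wants the latest audio. Because the thread drains the device continuously, a slow function in the main loop no longer lets stale data pile up in the ALSA buffer.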