Bad timing when playing audio files with PyGame

Question

When I play a sound every 0.5 seconds with PyGame:

import pygame, time

pygame.mixer.init()
s = pygame.mixer.Sound("2.wav")

for i in range(8):
  pygame.mixer.Channel(i).play(s)
  time.sleep(0.5)

it doesn't respect the timing correctly at all.

It's like there are pauses of 0.2 sec, then 0.7 sec, then 0.2 sec again; it's very irregular.

Notes:

- I know that time.sleep() is not the most accurate in the world, but even with the more accurate solutions from here, the problem is still present.
- Tested on a Raspberry Pi.
- The problem is still there if I play many different files with s[i].play(), with i in a big range, so it doesn't come from trying to replay the same file.

"pause of 0.2 sec than 0.7 sec then 0.2 sec again, it's very irregular." that doesn't sound very irregular to me, it sounds cyclical. How long does the sound last? Especially since `0.7 == (sleep_time + 0.2)` — roganjosh, Mar 03 '17 at 01:28
@roganjosh why would there be such a cycle? `s.play()` is non blocking and should continue the code — Basj, Mar 03 '17 at 07:24
@roganjosh I modified the code to make use of channels, so each sound played should not interfere with another one. Problem still present... — Basj, Mar 03 '17 at 07:29
Have you tried what effect it would have without the pause? I would guess the 0.7 is based on the fact that your pause is 0.5 seconds, so basically without the pause every 0.2 a sound would be played? How long is your audio file? (if it's 0.2 secs, maybe it actually is somehow correlated) — Cribber, Mar 03 '17 at 07:38
Nearly solved with `pygame.mixer.init(frequency=44100, size=-16, channels=2, buffer=512)` but the rhythm is still a bit clumsy. I can't get a perfect "metronome" beat. — Basj, Mar 03 '17 at 08:10
It seems impossible to achieve "metronome" accuracy with individual `play()` calls. I once wrote an answer to a similar question [here](http://stackoverflow.com/a/41353909/859499), using an alternative approach to play sounds in regular intervals. It's a bit more involved, but might be interesting nevertheless. — Meyer, Mar 03 '17 at 12:01
@Meyer nice approach but not usable for me : I'm creating a drummachine and BPM should be modifiable in real time with a potentiometer, so using your approach I should recreate many times the sound array while turning the knob — Basj, Mar 03 '17 at 12:54
You could generate all possible sounds at the beginning and cache them in a list. This way, getting the right sound would be cheap. Of course, this might not be possible if memory is restricted. But for example, to cache sounds from 60-200bpm, that would only be 91 seconds total, or about 2MB at 22050Hz 8bit mono. Anyway, there might indeed be a more elegant solution, so this is just a "last resort" kind of idea. — Meyer, Mar 03 '17 at 14:30

score 1 · Answer 1 · answered Mar 03 '17 at 13:28

Here is the reason:

Even if we decrease the audio buffer to the minimum supported by the soundcard (1024 or 512 samples instead of pygame's default 4096), the timing differences will still be there, making what should be a "metronome beat" irregular.
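To get a feel for the numbers, here is a rough sketch of the effect under a simplified model: it assumes a sound can only start at the next buffer boundary, and it assumes a 22050 Hz sample rate (neither is confirmed for this exact setup). The helper names `quantized_onsets` and `gaps` are made up for illustration.

```python
def quantized_onsets(n_beats, interval, buffer_size, sample_rate):
    """Onset times (in seconds) if each sound can only start at the
    next buffer boundary (simplified model, see assumptions above)."""
    onsets = []
    for i in range(n_beats):
        requested = round(i * interval * sample_rate)  # requested start, in samples
        blocks = -(-requested // buffer_size)          # ceiling division
        onsets.append(blocks * buffer_size / sample_rate)
    return onsets


def gaps(onsets):
    """Time between consecutive onsets."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


if __name__ == "__main__":
    # Assumed sample rate: 22050 Hz; requested interval: 0.5 s, as in the question.
    for buffer_size in (4096, 512):
        g = gaps(quantized_onsets(8, 0.5, buffer_size, 22050))
        print(buffer_size, [round(x, 3) for x in g])
```

With a 4096-sample buffer, one buffer lasts about 0.186 s at this rate, so the gaps in this model jump between roughly 0.372 s and 0.557 s instead of staying at 0.5 s. With 512 samples the error per onset shrinks to at most about 0.023 s, which is better but still audible as an unsteady beat. The model does not reproduce the exact 0.2/0.7 s pattern from the question; it only shows why a smaller buffer helps without fully fixing the problem.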

I'll update with a working solution as soon as I find one. (I have a few ideas in this direction).

score 1 · Answer 2 · answered Mar 06 '17 at 16:26

As you wrote in your own answer, the reason for the timing problems very likely is the fact that the audio callback runs decoupled from the rest of the application.

The audio backend typically has some kind of a clock which is accessible from both inside the callback function and outside of it. I see two possible solutions:

1. Use a library that allows you to implement the callback function yourself, calculate the starting times of your sounds inside the callback function, compare those times with the current time of the "audio clock" and write your sound to the output at the appropriate position in the output buffer.
2. Use a library that allows you to specify the exact time (in terms of the "audio clock") when to start playing your sounds. This library would do the steps of the previous point for you.

For the first option, you could use the sounddevice module. The callback function (which you'll have to implement) will get an argument named time, which has an attribute time.outputBufferDacTime, which is a floating point value specifying the time (in seconds) when the first sample of the output buffer will be played back.
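A minimal sketch of that first option, assuming the sounddevice and NumPy packages are installed and a mono output device is available. The callback signature, `time.outputBufferDacTime`, `Stream.time` and `sd.sleep()` come from sounddevice; the helpers `start_offset`, `make_callback` and `demo`, and the `pending` list, are hypothetical names introduced only for this illustration.

```python
def start_offset(event_time, buffer_dac_time, samplerate, frames):
    """Frame index inside the current block where a sound scheduled for
    event_time should start, or None if it belongs to a later block."""
    offset = round((event_time - buffer_dac_time) * samplerate)
    if offset >= frames:
        return None
    return max(offset, 0)  # events that are already late start immediately


def make_callback(sound, pending, samplerate):
    """Build a sounddevice callback that mixes `sound` at the scheduled times."""
    voices = []  # each voice: [start frame in this block, read position in sound]

    def callback(outdata, frames, time, status):
        outdata.fill(0)
        # Start every pending event that falls inside this block.
        for event_time in list(pending):
            offset = start_offset(event_time, time.outputBufferDacTime,
                                  samplerate, frames)
            if offset is not None:
                pending.remove(event_time)
                voices.append([offset, 0])
        # Mix all active voices into the output block.
        for voice in list(voices):
            start, pos = voice
            n = min(frames - start, len(sound) - pos)
            outdata[start:start + n, 0] += sound[pos:pos + n]
            voice[0], voice[1] = 0, pos + n  # continue at the top of the next block
            if voice[1] >= len(sound):
                voices.remove(voice)

    return callback


def demo():
    # Imported here so the helpers above can be used without audio hardware.
    import numpy as np
    import sounddevice as sd

    samplerate = 44100
    t = np.arange(int(0.05 * samplerate)) / samplerate
    click = (0.3 * np.sin(2 * np.pi * 1000 * t)).astype("float32")  # 50 ms beep

    pending = []
    stream = sd.OutputStream(samplerate=samplerate, channels=1,
                             callback=make_callback(click, pending, samplerate))
    with stream:
        first = stream.time + 0.5  # schedule on the stream's own clock
        pending.extend([first + 0.5 * i for i in range(8)])
        sd.sleep(5000)
```

Calling `demo()` should play eight clicks 0.5 s apart. Sharing a plain list between the main thread and the callback is a simplification; a real implementation would more likely use a queue, and doing this much work in a Python callback can itself cause dropouts on slow hardware.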

Full disclosure: I'm the author of the sounddevice module, so my recommendation is quite biased.

Quite recently, I've started working on the rtmixer module, which can be used for the second option. Please note that it is at a very early stage of development, so use it with caution. With this module, you don't have to write a callback function; you can use the function rtmixer.Mixer.play_buffer() to play an audio buffer at a specified time (in seconds). For reference, you can get the current time from rtmixer.Mixer.time.

Matthias


Very good answer! Thanks! I went nearly to the same kind of idea: sending a timestamp from the "play sound events" loop thread, that the audio callback will use in the "mixing thread", with the appropriate position. I'll definitely be using `sounddevice`, that I already use for my project http://www.samplerbox.org ! Thanks – Basj Mar 06 '17 at 19:22
Cool stuff, I'm impressed! – Matthias Mar 06 '17 at 21:12
Re @Matthias, does `rtmixer` do the actual mixing/summing in C or Cython (not cpython but cython) to speed it up? It's how I did the mixing for SamplerBox ([see here](https://github.com/josephernest/SamplerBox/blob/master/samplerbox_audio.pyx)) and I was wondering if you also noticed that doing the mixing math part in a compiled language also improved massively the perf for you. – Basj Mar 07 '17 at 13:50
@Basj I've implemented the whole audio callback function in C (see [rtmixer.c](https://github.com/mgeier/python-rtmixer/blob/master/src/rtmixer.c)) and it is passed directly to the PortAudio library (without wrapping it into a Python function), partly for speeding it up but mainly for avoiding the GIL. I like your use of Cython but it would be even better to get rid of the shared state and implement the whole callback function in Cython. We should probably continue the discussion on Github, feel free to mention me there (`@mgeier`). – Matthias Mar 07 '17 at 14:27
Thanks @Matthias. Last thing here (and then I'll continue other related discussions on Github): how to compare `time_info.outputBufferDacTime` (very useful indeed) with the time when the event is triggered? For example `time.time()` gives me 1488896540.18 whereas `outputBufferDacTime` gives 2344.26198273. How to compare them if they are not in same unit? – Basj Mar 07 '17 at 14:31
@Basj You cannot use `time.time()` here, PortAudio has its own time base. As I said in my answer, you should use `rtmixer.Mixer.time` (which is the same as [sounddevice.Stream.time](http://python-sounddevice.readthedocs.io/en/latest/#sounddevice.Stream.time)). – Matthias Mar 07 '17 at 15:36
