Algorithm to deal with Audio click/pop sounds

Question

I am making a sound engine that can play and stop sounds. My issue is that when a user wants to stop a sound, I stop it immediately, i.e. I send 0 as the PCM value. This produces a very annoying pop/click, because the PCM value drops from, let's say, 0.7 to 0 in a single sample.

Here is a discussion about this.

I am looking for an algorithm or a way to deal with these audio clicks/pops. What is the best practice for dealing with audio clicks? Is there a universal way to go about this? I am very new to audio DSP and could not find a good answer for this.

score 3 · Accepted Answer · answered Apr 10 '21 at 15:34

When you cut off the sound abruptly, you are multiplying it by a step-shaped signal.

When you multiply two signals together in the time domain, you convolve their spectra in the frequency domain. A step has energy at all frequencies, so the multiplication smears the sound's energy across the whole spectrum, making an audible pop.
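In symbols, one way to write this down (the names x for the sound, w for the on/off gain, u for the unit step, and t_0 for the stop time are introduced here for illustration only):

```latex
y(t) = x(t)\,w(t)
\quad\Longleftrightarrow\quad
Y(f) = (X * W)(f) = \int_{-\infty}^{\infty} X(\nu)\,W(f-\nu)\,d\nu

% Abrupt stop at t_0: w(t) = 1 - u(t - t_0), whose spectrum away from f = 0 satisfies
|W(f)| = \frac{1}{2\pi |f|}, \qquad f \neq 0
```

The 1/|f| tail decays slowly, so every component of the sound gets a copy of that tail around it. A smoother w has a tail that falls off faster, which is why a short fade removes the pop.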

Instead, you want to fade the sound out over 30ms or so -- that is still very fast, and will sound like an abrupt stop, but there will be no audible pop.

You should use a curve shaped like 1-t² to modulate the volume, or something else without significant high-frequency components. That way, when it is convolved with the original sound in the frequency domain, it won't spread energy far from the frequencies that are already there.
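A minimal sketch of such a fade-out in Python, assuming the sound is a plain list of float PCM samples. The function name `fade_out` and its parameters are made up for illustration and are not from any particular audio library.

```python
def fade_out(samples, sample_rate, fade_ms=30.0):
    """Return a copy of `samples` whose tail is faded to silence with gain 1 - t^2.

    Assumption: `samples` is a sequence of floats in [-1.0, 1.0] (mono PCM).
    """
    # Number of samples covered by the fade, clamped to the buffer length.
    n_fade = min(len(samples), max(1, int(sample_rate * fade_ms / 1000.0)))
    start = len(samples) - n_fade
    out = list(samples)
    for i in range(n_fade):
        t = (i + 1) / n_fade            # t rises from just above 0 to exactly 1
        out[start + i] *= 1.0 - t * t   # gain falls smoothly from ~1 to 0
    return out


# Example: 100 ms of a constant 0.7 signal at 44.1 kHz, faded over the last 30 ms.
faded = fade_out([0.7] * 4410, 44100)
```

In a real-time engine the same idea applies per callback: when the user presses stop, keep rendering the source for another 30 ms or so while applying this gain, and only then output silence.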

Matt Timmermans

I tried this yesterday: 23 ms worked fine with a linear slope, but sometimes I would still get pops. I will try the `1-t^2` curve now. Amazing explanation! – cs guy Apr 10 '21 at 15:52
Also, let me ask you this: when playing audio from silence the same problem occurs. Is it reasonable to use this curve again, or do I need another curve? – cs guy Apr 10 '21 at 18:20
The curve I mean is `t^2`, for fade-in; I was not clear there. – cs guy Apr 10 '21 at 18:49
If a curve works in one direction, it should work in the reverse. – Phil Freihofner Apr 10 '21 at 20:57
@csguy `1-t^2` is for lead-in too, with negative t, since it's symmetric. As Phil says, if it works one way then it works the other way. The actual curve you use is not too critical. Half of any of the common "window functions" works fine. – Matt Timmermans Apr 10 '21 at 21:05
@MattTimmermans, given the "convolution" explanation, I'm wondering if symmetric easing is preferred, or if only the silent end or the full-volume end requires the easing. A linear transition with 1024 steps has worked pretty well for my real-time volume faders. Computationally, this only requires adding a constant increment (+/- 1/1024) to the volume factor for each frame. But I am now thinking: maybe add a linear transition to the linear transition amount, to get an easy-to-compute easing (no exponents needed on a per-frame basis, just two additions). Any thoughts on that? – Phil Freihofner Apr 10 '21 at 21:44
Easing at both ends can provide some benefit. Linear is already not too bad, though, so I don't think it's critical unless you're interested in squeezing the duration as much as you can. Have a look at the MDCT-compatible windows here: https://en.wikipedia.org/wiki/Modified_discrete_cosine_transform#Window_functions. The cosine window is a lot like 1-t^2. The other choices are a little better. If you're interested in something you can do easily and incrementally, then passing your linear ramp through an integrator-comb filter is reasonable. You don't need any memory if you use 2 ramps. – Matt Timmermans Apr 10 '21 at 22:17
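The "two additions per frame" idea from the comments can be sketched as follows. This is one possible reading of it, not necessarily what either commenter had in mind: the per-frame decrement itself changes by a constant amount, which reproduces the `1-t^2` fade-out exactly without any multiplication inside the loop. The function name `quadratic_fade_gains` is hypothetical.

```python
def quadratic_fade_gains(n_steps):
    """Yield n_steps gains following 1 - (n / n_steps)^2, n = 1..n_steps.

    Only two additions are done per step: one on the gain and one on its
    per-step increment (a forward-difference evaluation of the quadratic).
    """
    gain = 1.0
    delta = -1.0 / (n_steps * n_steps)   # first difference of 1 - t^2 at t = 0
    accel = -2.0 / (n_steps * n_steps)   # constant second difference
    for _ in range(n_steps):
        gain += delta    # addition 1: move the gain
        delta += accel   # addition 2: steepen the slope linearly
        yield gain


# Example: the 1024-step fader mentioned in the comments.
gains = list(quadratic_fade_gains(1024))
```

Note that this shape eases only at the full-volume end: the slope is zero at the start and steepest where the gain reaches zero. For a fade-in, the same gains can be applied in reverse order.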
