
OK, I've implemented the Karplus-Strong algorithm in C. It's a simple algorithm for simulating a plucked string sound. You start with a ring buffer of length N (N = sampling frequency / desired frequency) filled with noise, pass it through a simple two-point averaging filter y[n] = (x[n] + x[n-1])/2, output the result, and feed it back into the delay line. Rinse and repeat. This smooths out the noise over time to create a natural plucked string sound.
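For reference, the basic loop described above could be sketched roughly like this. The struct and function names (`ks_string`, `ks_pluck`, `ks_tick`) are made up for illustration, and the noise source is plain `rand()`; this is a minimal sketch, not my exact code.

```c
#include <stdlib.h>

/* Hypothetical state for one plucked string; names are illustrative only. */
typedef struct {
    float *buf;   /* delay line holding one period of the waveform */
    int    len;   /* N = sample_rate / frequency, truncated to an integer */
    int    pos;   /* index of the oldest sample in the ring buffer */
    float  prev;  /* x[n-1] for the two-point average */
} ks_string;

/* Allocate the delay line and fill it with white noise in [-1, 1].
   Returns 0 on success, -1 if the pitch is too high or allocation fails. */
int ks_pluck(ks_string *s, float sample_rate, float frequency)
{
    s->len = (int)(sample_rate / frequency);
    if (s->len < 2)
        return -1;
    s->buf = malloc((size_t)s->len * sizeof *s->buf);
    if (!s->buf)
        return -1;
    for (int i = 0; i < s->len; i++)
        s->buf[i] = 2.0f * (float)rand() / (float)RAND_MAX - 1.0f;
    s->pos = 0;
    s->prev = 0.0f;
    return 0;
}

/* Produce one output sample: y[n] = (x[n] + x[n-1]) / 2, then feed y back. */
float ks_tick(ks_string *s)
{
    float x = s->buf[s->pos];
    float y = 0.5f * (x + s->prev);
    s->prev = x;
    s->buf[s->pos] = y;               /* overwrite the oldest sample */
    s->pos = (s->pos + 1) % s->len;
    return y;
}
```

Calling `ks_pluck()` once and then `ks_tick()` once per output sample gives the decaying pluck; the caller is responsible for `free(s->buf)`.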

But I noticed that with an integer delay line length, several high pitches map to the same delay length. Also, an integer delay length doesn't allow for smoothly varying pitches (as in vibrato or glissando). I've read several papers on extensions to the Karplus-Strong algorithm, and they all talk about using either an interpolated delay line for fractional delay or an allpass filter:
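To make the quantization concrete, here is a small sketch of the arithmetic. The function names are hypothetical, and it ignores the extra delay that the averaging filter itself adds to the loop, so the numbers are only approximate.

```c
/* Delay length actually used when the ideal length is truncated to an integer. */
int ks_integer_delay(double sample_rate, double frequency)
{
    return (int)(sample_rate / frequency);
}

/* Pitch that this integer delay produces (loop filter delay ignored). */
double ks_quantized_frequency(double sample_rate, double frequency)
{
    return sample_rate / ks_integer_delay(sample_rate, frequency);
}
```

At 44100 Hz, for example, requests for 3700 Hz and 4000 Hz both truncate to a delay of 11 samples, so under these assumptions both come out at about 4009 Hz.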

http://quod.lib.umich.edu/cgi/p/pod/dod-idx?c=icmc;idno=bbp2372.1997.068
http://www.jaffe.com/Jaffe-Smith-Extensions-CMJ-1983.pdf
http://www.music.mcgill.ca/~gary/courses/projects/618_2009/NickDonaldson/index.html

I've implemented interpolated delay lines before, but only on wavetables where the waveform buffer doesn't change: I just step through the buffer at different rates. What confuses me is that, for the KS algorithm, the papers seem to be talking about actually changing the delay length instead of just the rate at which I step through it. The KS algorithm complicates things because I'm supposed to be constantly feeding values back into the delay line.

So how would I go about implementing this? Do I feed the interpolated value back in or what? Do I get rid of the two point averaging low pass filter completely?

And how would the allpass filter work? Am I supposed to replace the two-point averaging filter with the allpass filter? And how would I glide between distant pitches (glissando) using either the linear interpolation method or the allpass filter method?

P i
unknown

2 Answers


I implemented three variations. All have their pros and cons, but none is as good as I would like. Maybe someone has a better algorithm and wants to share it here?

In general, I do it as jbarlow describes. I use a ring buffer of length 2^x, where x is "large enough", e.g. 12. That gives a maximum delay length of 2^12 = 4096 samples, which corresponds to about 12 Hz as the lowest base frequency when rendering at 48 kHz. The reason for the power of two is that the modulo can be done with a bitwise AND, which is much cheaper than an actual modulo operation.

// init
int writepointer = 0;

// loop:
writepointer = (writepointer+1) & 0xFFF;

The write pointer is kept simple: it starts at, e.g., 0 and always increments by 1 for each output sample.

The read pointer starts at an offset relative to the write pointer, recalculated every time the frequency changes.

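// init (redo whenever the frequency changes)
float delta = samplingrate / frequency;                      // ideal delay in samples
int readpointer = (writepointer - (int)delta - 1) & 0xFFF;   // rounded-down read position
float frac = delta - (int)delta;                             // fractional part of the delay
float weight_a = frac;                                       // weight of the sample at readpointer
float weight_b = 1.0f - frac;                                // weight of the next sample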

// loop:
readpointer = (readpointer + 1) & 0xFFF;

It also increments by 1, but the ideal read position usually lies somewhere between two integer positions. The rounded-down position is stored in the integer readpointer, and weight_a and weight_b are the weights of this sample and the next one, respectively.

Variation #1: Ignore the fractional part and treat the (integer) read pointer as-is.

Pros: no side effects; a perfect delay (no implicit low-pass from the delay, which means full control over the frequency response and no artefacts)

Cons: the base frequency is usually slightly off because it is quantized to integer delay lengths. This sounds very detuned for high-pitched notes, and subtle pitch changes are impossible.

Variation #2: Linearly interpolate between the sample at readpointer and the next sample. That means I actually read two consecutive samples from the ring buffer and sum them, weighted by weight_a and weight_b respectively.

Pros: perfect base frequency, no artefacts

Cons: The linear interpolation introduces a low-pass filter that may not be desired. Even worse, the low-pass varies with the pitch. If the fractional part is close to 0 or 1, there is very little low-pass filtering, while a fractional part around 0.5 filters heavily. That makes some notes of the instrument brighter than others, and a note can never be brighter than this low-pass allows (bad for steel guitar or harpsichord).
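A rough sketch of variation #2, plus a helper that shows why the brightness depends on the fractional part. The function names and the `KS_SIZE`/`KS_MASK` macros are only illustrative; the weights follow the convention above (frac on the sample at readpointer, 1 - frac on the next one).

```c
#define KS_SIZE 4096            /* 2^12 ring buffer, as in the text above */
#define KS_MASK (KS_SIZE - 1)   /* 0xFFF */

/* Interpolated read between ring[readpointer] and the following sample. */
float ks_read_linear(const float *ring, int readpointer, float frac)
{
    float a = ring[readpointer & KS_MASK];        /* sample at readpointer */
    float b = ring[(readpointer + 1) & KS_MASK];  /* next sample */
    return frac * a + (1.0f - frac) * b;
}

/* Gain of this two-tap filter at the Nyquist frequency: |1 - 2*frac|.
   It is 1 for frac = 0 and drops to 0 for frac = 0.5. */
float ks_linear_nyquist_gain(float frac)
{
    float g = 1.0f - 2.0f * frac;
    return g < 0.0f ? -g : g;
}
```

The second function is just the two-tap filter (1 - frac) + frac * z^-1 evaluated at z = -1, which is where the "heavy filtering around 0.5" comes from.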

Variation #3: A kind of jittering. I always read the delay from an integer position, but keep track of the error this causes: a variable accumulates the fractional part. Once it exceeds 1, I subtract 1.0 from the error and read the delay from the second position.

Pros: perfect base frequency, no implicit low-pass

Cons: introduces audible artefacts that make it sound lo-fi (like downsampling with nearest-neighbour interpolation).

Conclusion: None of the variations is satisfying. You either get the wrong pitch, lose the neutral frequency response, or introduce artefacts.
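The error accumulation in variation #3 could look roughly like this. The names are hypothetical, and the sketch returns a delay in samples rather than a buffer index, so which of the two neighbouring positions counts as the "second" one depends on how the read pointer is set up.

```c
/* Running fractional error; illustrative name. */
typedef struct {
    float error;
} ks_dither;

/* Integer delay to use for this sample: `whole`, or `whole + 1` whenever the
   accumulated fractional error reaches one full sample. On average the delay
   comes out as whole + frac. */
int ks_dithered_delay(ks_dither *d, int whole, float frac)
{
    d->error += frac;
    if (d->error >= 1.0f) {
        d->error -= 1.0f;
        return whole + 1;
    }
    return whole;
}
```

With whole = 100 and frac = 0.25, every fourth sample uses a delay of 101 and the rest use 100, which is the jitter that causes the audible artefacts.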

I read in the literature that an allpass filter should do better, but isn't the delay line an allpass already? What would be the difference in implementation?
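As far as I understand the Jaffe-Smith paper linked in the question, the allpass they mean is a first-order filter y[n] = C*x[n] + x[n-1] - C*y[n-1] placed in series with an integer delay, with C = (1 - d)/(1 + d) for a fractional delay of d samples. A minimal sketch, with made-up names and not tested in a real instrument:

```c
/* First-order allpass used as a fractional delay; names are illustrative. */
typedef struct {
    float c;    /* allpass coefficient */
    float x1;   /* previous input,  x[n-1] */
    float y1;   /* previous output, y[n-1] */
} ks_allpass;

/* Choose the coefficient so the low-frequency delay is about d samples.
   Assumption: d is kept positive and of the order of one sample. */
void ks_allpass_set_delay(ks_allpass *ap, float d)
{
    ap->c = (1.0f - d) / (1.0f + d);
}

/* y[n] = C*x[n] + x[n-1] - C*y[n-1] */
float ks_allpass_tick(ks_allpass *ap, float x)
{
    float y = ap->c * x + ap->x1 - ap->c * ap->y1;
    ap->x1 = x;
    ap->y1 = y;
    return y;
}
```

Unlike the linear interpolator, this has unity gain at every frequency, so it does not add a pitch-dependent low-pass. The price is that the delay is only exactly d at low frequencies, and the filter has internal state that reacts when d changes quickly.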

Thilo Köhler

Digital signal processing algorithms are often represented as block diagrams for good reason: it is an excellent way to think about them. When coding them, think of each block as a separate unit with fixed inputs and outputs. I think some of your questions come from trying to prematurely combine the various elements of the system.

Here is a block diagram for Karplus Strong.

[Figure: Karplus-Strong block diagram, from Wikipedia]

For the delay block, you need to implement a fractional delay line. This will include its own lowpass filter, but that is a detail of how the delay line is implemented. The Karplus-Strong effect also requires a lowpass filter. The characteristics of these two filters will be different, so don't try to combine them. By the way, the averaging lowpass filter you have selected has a poor frequency response that introduces a "comb filter" effect. You might want to design a more sophisticated FIR or IIR filter.

So how would I go about implementing this? Do I feed the interpolated value back in or what? Do I get rid of the two point averaging low pass filter completely?

You do feed the interpolated, summed sample back into the delay line, just like the block diagram shows. In some cases this can start to increase the net gain of the system, and you might need to "normalize" the output of the delay so that it does not get out of control, if that's what you're worried about.
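As a rough sketch of that loop with the two blocks kept separate: a linearly interpolated delay read, followed by the two-point average, with the result written back into the line. All names here (`ks_voice`, `DL_SIZE`, and so on) are invented for the example, and the delay is assumed to stay in the range 1 <= delay < DL_SIZE - 1.

```c
#define DL_SIZE 4096u            /* power of two so wrapping is a bitwise AND */
#define DL_MASK (DL_SIZE - 1u)

typedef struct {
    float    line[DL_SIZE];
    unsigned write;              /* next slot to be written */
    float    delay;              /* delay in samples, may be fractional */
    float    lp_prev;            /* previous input of the lowpass block */
} ks_voice;

/* Delay block: interpolated read `delay` samples behind the write index. */
static float ks_delay_read(const ks_voice *v)
{
    unsigned whole = (unsigned)v->delay;
    float    frac  = v->delay - (float)whole;
    float newer = v->line[(v->write - whole) & DL_MASK];
    float older = v->line[(v->write - whole - 1u) & DL_MASK];
    return (1.0f - frac) * newer + frac * older;
}

/* One sample of the loop: delay -> lowpass -> output and feedback. */
float ks_voice_tick(ks_voice *v)
{
    float x = ks_delay_read(v);
    float y = 0.5f * (x + v->lp_prev);    /* lowpass block: two-point average */
    v->lp_prev = x;
    v->line[v->write & DL_MASK] = y;      /* feed the filtered value back in */
    v->write++;
    return y;
}
```

Note that the averaging filter adds roughly half a sample of delay of its own, so for accurate tuning you would probably set `delay` about 0.5 below sample_rate / frequency.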

There are many valid strategies for implementing a fractional delay line, including interpolation and allpass filtering as you mention. The idea is that you maintain read and write indexes into the delay line. The length of the delay is not the total length of the memory buffer, but the difference between the indexes modulo the total buffer length. Make the buffer as big as it needs to be and don't worry about resizing it.

I find it most convenient to treat read and write as free-running counters that never wrap around or expire, because then

current_delay_length = (write - read) % total_delay_length
current_read_sample = delay_line[read % total_delay_length]

where % is the modulus operator. The write and read counters could also carry the fractional length if they are floating-point values or set up as fixed point. In any case, this makes it easy to modify the length of the delay line. It is important to ensure that a minimum delay is enforced (write > read).

Believe it or not, you will change the delay line length by changing the rate you step through it, just like a fixed-length buffer. Generally you will modulate the read index a little bit. It should never fall more than a buffer length behind the write pointer or get ahead of it, or you will get glitches. But you are free to move the read pointer anywhere in the wake of the write pointer. Changing the modulation will get different effects.
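In C, one way to get free-running counters is to use unsigned integers and a power-of-two buffer, so that counter overflow is harmless. This is only a sketch with invented names (`fr_delay`, `FR_SIZE`), using integer delays for brevity:

```c
#include <stdint.h>

#define FR_SIZE 4096u   /* assumed power-of-two buffer length */

typedef struct {
    float    line[FR_SIZE];
    uint32_t write;     /* total number of samples written so far */
    uint32_t read;      /* free-running read counter */
} fr_delay;

/* current_delay_length = (write - read) % total_delay_length */
uint32_t fr_current_delay(const fr_delay *d)
{
    return (d->write - d->read) % FR_SIZE;
}

/* current_read_sample = delay_line[read % total_delay_length] */
float fr_current_sample(const fr_delay *d)
{
    return d->line[d->read % FR_SIZE];
}

/* Append one sample and advance the write counter. */
void fr_push(fr_delay *d, float x)
{
    d->line[d->write % FR_SIZE] = x;
    d->write++;
}

/* Move the read counter `delay` samples behind the write counter.
   Enforces a minimum delay of 1 and a maximum below the buffer size. */
int fr_set_delay(fr_delay *d, uint32_t delay)
{
    if (delay < 1u || delay >= FR_SIZE)
        return -1;
    d->read = d->write - delay;
    return 0;
}
```

Because FR_SIZE divides 2^32, the subtraction and the modulus still give the right answer after the 32-bit counters overflow.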

I stress that effects such as glissando come from how the delay line's read and write indexes are manipulated, not how it is implemented. You will get similar sounds from an allpass filter or a linearly interpolated delay line. Better fractional delay lines will reduce aliasing noise and support more rapid changes of read pointer, for example.
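For a glissando, that manipulation can be as simple as recomputing the target delay on every sample while the pitch moves. A minimal sketch, assuming a linear sweep in frequency (a musical glide might prefer a geometric one) and a hypothetical helper name:

```c
/* Delay in samples at step i of an n-sample glide from f_start to f_end (Hz).
   The frequency is interpolated linearly; the delay is sample_rate / f. */
double ks_glide_delay(double sample_rate, double f_start, double f_end,
                      int i, int n)
{
    double t = (n > 1) ? (double)i / (double)(n - 1) : 1.0;
    double f = f_start + (f_end - f_start) * t;
    return sample_rate / f;
}
```

Each sample, the returned value would become the current (fractional) distance between the write and read indexes. Gliding from 440 Hz to 880 Hz at 48 kHz, for instance, moves the delay from about 109.1 samples down to about 54.5.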

jbarlow