4

What is the most efficient way of generating a sine wave for a device running IOS. For the purposes of the exercise assume a frequency of 440Hz and a sampling rate of 44100Hz and 1024 samples.

A vanilla C implementation looks something like.

#define SAMPLES 1024
#define TWO_PI (3.14159 * 2)
#define FREQUENCY 440
#define SAMPLING_RATE 44100

int main(int argc, const char * argv[]) {
    float samples[SAMPLES];

    float phaseIncrement = TWO_PI * FREQUENCY / SAMPLING_RATE;
    float currentPhase = 0.0;
    for (int i = 0; i < SAMPLES; i ++){
        samples[i] = sin(currentPhase);
        currentPhase += phaseIncrement;
    }

    return 0;
}

To take advantage of the Accelerate Framework and the vecLib vvsinf function the loop can be changed to only do the addition.

#define SAMPLES 1024
#define TWO_PI (3.14159 * 2)
#define FREQUENCY 440
#define SAMPLING_RATE 44100

int main(int argc, const char * argv[]) {
    float samples[SAMPLES] __attribute__ ((aligned));
    float results[SAMPLES] __attribute__ ((aligned));

    float phaseIncrement = TWO_PI * FREQUENCY / SAMPLING_RATE;
    float currentPhase = 0.0;
    for (int i = 0; i < SAMPLES; i ++){
        samples[i] = currentPhase;
        currentPhase += phaseIncrement;
    }
    vvsinf(results, samples, SAMPLES);

    return 0;
}

But is just applying the vvsinf function as far as I should go in terms of efficiency?

I don't really understand the Accelerate framework well enough to know if I can also replace the loop. Is there a vecLib or vDSP function I can use?

For that matter is it possible to use an entirely different alogrithm to fill a buffer with a sine wave?

TJA
  • 2,969
  • 2
  • 25
  • 32

2 Answers2

5

Given that you are computing the sine of a phase argument which increases in fixed increments, it is generally much faster to implement the signal generation with a recurrence equation as described in this "How to Create Oscillators in Software" post and some more in this "DSP Trick: Sinusoidal Tone Generator" post, both on dspguru:

y[n] = 2*cos(w)*y[n-1] - y[n-2]

Note that this recurrence equation can be subject to numerical roundoff error accumulation, you should avoid computing too many samples at a time (your selection of SAMPLES == 1024 should be fine). This recurrence equation can be used after you have obtained the first two values y[0] and y[1] (the initial conditions). Since you are generating with an initial phase of 0, those are simply:

samples[0] = 0;
samples[1] = sin(phaseIncrement); 

or more generally with an arbitrary initial phase (particularly useful to reinitialize the recurrence equation every so often to avoid the numerical roundoff error accumulation I mentioned earlier):

samples[0] = sin(initialPhase);
samples[1] = sin(initialPhase+phaseIncrement); 

The recurrence equation can then be implemented directly with:

float scale = 2*cos(phaseIncrement);
// initialize first 2 samples for the 0 initial phase case
samples[0] = 0;
samples[1] = sin(phaseIncrement);     
for (int i = 2; i < SAMPLES; i ++){
    samples[i] = scale * samples[i-1] - samples[i-2];
}

Note that this implementation could be vectorized by computing multiple tones (each with the same frequency, but with larger phase increments between samples) with appropriate relative phase shifts, then interleaving the results to obtain the original tone (e.g. computing sin(4*w*n), sin(4*w*n+w), sin(4*w*n+2*w) and sin(4*w*n+3*w)). This would however make the implementation a lot more obscure, for a relatively small gain.

Alternatively the equation can be implemented by making use of vDsp_deq22:

// setup dummy array which will hold zeros as input
float nullInput[SAMPLES];
memset(nullInput, 0, SAMPLES * sizeof(float));

// setup filter coefficients
float coefficients[5];
coefficients[0] = 0;
coefficients[1] = 0;
coefficients[2] = 0;
coefficients[3] = -2*cos(phaseIncrement);
coefficients[4] = 1.0;

// initialize first 2 samples for the 0 initial phase case
samples[0] = 0;
samples[1] = sin(phaseIncrement); 
vDsp_deq22(nullInput, 1, coefficients, samples, 1, SAMPLES-2);
SleuthEye
  • 14,379
  • 2
  • 32
  • 61
  • Thanks for the answer. I've seen various code examples that reset, I think, every TWO_PI samples by including some kind of check in the loop. However as we are aiming for efficiency would it be practical to generate (N * TWO_PI * FREQUENCY / SAMPLING_RATE) samples at a time. And repeat until the buffer is filled with the total number of samples required? – TJA Jan 26 '16 at 05:15
  • The reset every 2pi is common in implementations based on lookup tables (to keep the index within the table), or even in implementations relying directly on the phase to reduce floating point inaccuracies with very large phase angles. Since this post doesn't directly use the phase in the loop you don't have that problem. You do however have error accumulation (as I mentioned in the post), which can indeed be resolved by generating samples in blocks (though the size of the block doesn't have to be related to N*TWO_PI*FREQUENCY/SAMPLING_RATE) and repeat until the buffer is filled. So, yes. – SleuthEye Jan 26 '16 at 05:44
1

If efficiency is required, you could pre-load a 440hz (44100 / 440) sine waveform look-up table and loop around it without further mapping or pre-load a 1hz (44100 / 44100) sine waveform look-up table and loop around by skipping samples to reach 440hz just as you did by incrementing a phase counter. Using look-up tables should be faster than computing sin().

Method A (using 440hz sine waveform):

#define SAMPLES 1024
#define FREQUENCY 440
#define SAMPLING_RATE 44100
#define WAVEFORM_LENGTH (SAMPLING / FREQUENCY)

int main(int argc, const char * argv[]) {
    float waveform[WAVEFORM_LENGTH];
    LoadSinWaveForm(waveform);

    float samples[SAMPLES] __attribute__ ((aligned));
    float results[SAMPLES] __attribute__ ((aligned));

    for (int i = 0; i < SAMPLES; i ++){
        samples[i] = waveform[i % WAVEFORM_LENGTH];
    }

    vvsinf(results, samples, SAMPLES);

    return 0;
}

Method B (using 1hz sine waveform):

#define SAMPLES 1024
#define FREQUENCY 440
#define TWO_PI (3.14159 * 2)
#define SAMPLING_RATE 44100
#define WAVEFORM_LENGTH SAMPLING_RATE // since it's 1hz

int main(int argc, const char * argv[]) {
    float waveform[WAVEFORM_LENGTH];
    LoadSinWaveForm(waveform);

    float samples[SAMPLES] __attribute__ ((aligned));
    float results[SAMPLES] __attribute__ ((aligned));

    float phaseIncrement = TWO_PI * FREQUENCY / SAMPLING_RATE;
    float currentPhase = 0.0;
    for (int i = 0; i < SAMPLES; i ++){
        samples[i] = waveform[floor(currentPhase) % WAVEFORM_LENGTH];
        currentPhase += phaseIncrement;
    }

    vvsinf(results, samples, SAMPLES);

    return 0;
}

Please note that:

  • Method A is susceptible to frequency inaccuracy due to assuming that your frequency always divides correctly the sampling rate, which is not true. That means you may get 441hz or 440hz with a glitch.

  • Method B is susceptible to aliasing as the frequency goes up an gets closer to the Nyquist frequency, but it's a good trade-off between performance, quality and memory consumption if synthesizing reasonable low frequencies such as the one in your example.

  • Thanks Alex. The efficiency is part obsessive compulsive behaviour on my part and part premature optimisation. I fully intend to pre generate the buffers as a once off background initialisation process. But I want to do it efficiently as later on I'm looking at being able to change from an A4 reference pitch of 440Hz to say 432Hz and don't want a noticeable pause as the buffer list gets generated. – TJA Feb 26 '16 at 09:12
  • Then you only have to do it once. Generate the 1hz sine, essentially using method B. – Alex Tuduran Apr 05 '16 at 16:46