I am writing code for an embedded processor (ARM Cortex-M4).

The purpose of this code is to decode 4-bit ADPCM in Intel/DVI format (also called IMA format). I have encoded an ADPCM sample of a square wave using Python's audioop module. I have then decoded this sample successfully using the same audioop module, and it is a good match for the input.

However, I am unable to decode the input data correctly on my embedded processor. The valpred value, which represents the output, seems to run away and oscillate between a large positive and a large negative value. This seems to be driven by the behaviour of the sign value. The problem is that this code is effectively a carbon copy of the C implementation in audioop, with the Python-specific parts removed. The algorithm is, as far as I can tell, identical, yet it still goes into an oscillatory state for virtually every input data value. This is clearly driven by the sign flipping on the vpdiff value, but I can't see how this would be avoided, given that the quantization step is so high (typically at the maximum step index of 88) and the data does seem to have alternating signs.

This is the implementation I am working with now. The adpcm_step_size array contains the quantization steps (e.g. 7, 8, 9 ... 29794, 32767), whereas adpcm_step_size_adapt contains the step increments (-1, -1, -1, -1, 2, 4, 6, 8, duplicated).
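For reference, the two tables described above might be declared as follows. This is only a sketch: the values are the standard IMA/DVI ADPCM tables, the names match the ones used in the question, and the element types (`int16_t`, `int8_t`) are an assumption, since the original declarations are not shown.

```c
#include <stdint.h>

/* Standard IMA/DVI ADPCM step sizes, one entry per step index 0..88. */
static const int16_t adpcm_step_size[89] = {
        7,     8,     9,    10,    11,    12,    13,    14,    16,    17,
       19,    21,    23,    25,    28,    31,    34,    37,    41,    45,
       50,    55,    60,    66,    73,    80,    88,    97,   107,   118,
      130,   143,   157,   173,   190,   209,   230,   253,   279,   307,
      337,   371,   408,   449,   494,   544,   598,   658,   724,   796,
      876,   963,  1060,  1166,  1282,  1411,  1552,  1707,  1878,  2066,
     2272,  2499,  2749,  3024,  3327,  3660,  4026,  4428,  4871,  5358,
     5894,  6484,  7132,  7845,  8630,  9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Step index adjustment per 4-bit code. Bit 3 of the code is the sign,
   so the second half of the table repeats the first. */
static const int8_t adpcm_step_size_adapt[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};
```

The step values printed in the debug output below (for example step 34 at index 16 and step 32767 at index 88) agree with this table.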

void audio_adpcm_play(uint8_t *sample_data, uint16_t sample_size)
{
    int sign, delta, step, vpdiff, valpred, index, half;
    uint32_t debug_data;
    uint32_t result;
    uint8_t data = 0x00;

    // Initial state
    half = 0;
    valpred = 0;
    index = 0;
    step = adpcm_step_size[index];

    while(sample_size > 0) {
        // Extract the appropriate word
        if(half) {
            delta = data & 0x0f;
        } else {
            data = *sample_data++;
            delta = (data >> 4) & 0x0f;
            sample_size--;
        }

        half = !half;
        debug_data = delta;

        // Find new index value
        index += adpcm_step_size_adapt[delta];
        if(index < 0)
            index = 0;
        if(index > 88)
            index = 88;

        // Separate sign and magnitude
        sign = delta & 8;
        delta = delta & 7;

        // Compute difference and the new predicted value
        vpdiff = step >> 3;

        if(delta & 4)
            vpdiff += step;
        if(delta & 2)
            vpdiff += step >> 1;
        if(delta & 1)
            vpdiff += step >> 2;

        if(sign)
            valpred -= vpdiff;
        else
            valpred += vpdiff;

        // Clamp values that exceed the valid range
        if(valpred > 32767)
            valpred = 32767;
        else if(valpred < -32768)
            valpred = -32768;

        step = adpcm_step_size[index];

        result = (valpred + 32767) >> AUDIO_CODE_SHIFT;
        uart_printf(DBG_LVL_INFO, \
                "data=%02x,  source_byte=%02x,  samples_rem=%5d,  valpred=%7d,  vpdiff=%5d,  sign=%02x,  delta=%02x,  index=%3d,  step=%3d,  adapt=%3d,  res=%5d/%5d\r\n", \
                debug_data, data, sample_size, valpred, vpdiff, sign, delta, index, step, \
                adpcm_step_size_adapt[delta], result, AUDIO_CODE_DUTY_MAX);
    }
}
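A side note on the `result` line, separate from the oscillation problem: with an offset of 32767, a fully negative sample of -32768 becomes -1 before the shift, which then wraps when stored in a `uint32_t`. A minimal sketch of an alternative mapping is shown here. The helper name `pcm_to_duty` is hypothetical, and the shift of 8 used in the usage note is an assumption inferred from the 256 full-scale value in the debug output.

```c
#include <stdint.h>

/* Hypothetical helper: map a signed 16-bit sample (-32768..32767) to an
   unsigned duty value by offsetting it into 0..65535 and shifting down. */
static inline uint32_t pcm_to_duty(int32_t sample, unsigned shift)
{
    /* Offset by 32768 so the most negative sample maps to 0 and the
       intermediate value is never negative. */
    return ((uint32_t)(sample + 32768)) >> shift;
}
```

With a shift of 8, `pcm_to_duty(0, 8)` gives 128 (mid-scale) and the two extremes give 0 and 255.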

Here's the output for a square wave input. As can be seen, valpred rapidly oscillates between two values when it should settle at a given value.

data=07,  source_byte=f7,  samples_rem= 7999,  valpred=     19,  vpdiff=   30,  sign=00,  delta=07,  index= 16,  step= 34,  adapt=  8,  res=  128/  256
data=0f,  source_byte=f7,  samples_rem= 7998,  valpred=    -44,  vpdiff=   63,  sign=08,  delta=07,  index= 24,  step= 73,  adapt=  8,  res=  127/  256
data=07,  source_byte=f7,  samples_rem= 7998,  valpred=     92,  vpdiff=  136,  sign=00,  delta=07,  index= 32,  step=157,  adapt=  8,  res=  128/  256
data=0f,  source_byte=f7,  samples_rem= 7997,  valpred=   -201,  vpdiff=  293,  sign=08,  delta=07,  index= 40,  step=337,  adapt=  8,  res=  127/  256
data=07,  source_byte=f7,  samples_rem= 7997,  valpred=    430,  vpdiff=  631,  sign=00,  delta=07,  index= 48,  step=724,  adapt=  8,  res=  129/  256
data=0f,  source_byte=f7,  samples_rem= 7996,  valpred=   -927,  vpdiff= 1357,  sign=08,  delta=07,  index= 56,  step=1552,  adapt=  8,  res=  124/  256
data=07,  source_byte=f7,  samples_rem= 7996,  valpred=   1983,  vpdiff= 2910,  sign=00,  delta=07,  index= 64,  step=3327,  adapt=  8,  res=  135/  256
data=0f,  source_byte=f7,  samples_rem= 7995,  valpred=  -4253,  vpdiff= 6236,  sign=08,  delta=07,  index= 72,  step=7132,  adapt=  8,  res=  111/  256
data=07,  source_byte=f7,  samples_rem= 7995,  valpred=   9119,  vpdiff=13372,  sign=00,  delta=07,  index= 80,  step=15289,  adapt=  8,  res=  163/  256
data=0d,  source_byte=d5,  samples_rem= 7994,  valpred= -11903,  vpdiff=21022,  sign=08,  delta=05,  index= 84,  step=22385,  adapt=  4,  res=   81/  256
data=05,  source_byte=d5,  samples_rem= 7994,  valpred=  18876,  vpdiff=30779,  sign=00,  delta=05,  index= 88,  step=32767,  adapt=  4,  res=  201/  256
data=0b,  source_byte=b3,  samples_rem= 7993,  valpred=  -9793,  vpdiff=28669,  sign=08,  delta=03,  index= 87,  step=29794,  adapt= -1,  res=   89/  256
data=03,  source_byte=b3,  samples_rem= 7993,  valpred=  16276,  vpdiff=26069,  sign=00,  delta=03,  index= 86,  step=27086,  adapt= -1,  res=  191/  256
data=0c,  source_byte=c4,  samples_rem= 7992,  valpred= -14195,  vpdiff=30471,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7992,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=09,  source_byte=9c,  samples_rem= 7991,  valpred=  10381,  vpdiff=12286,  sign=08,  delta=01,  index= 87,  step=29794,  adapt= -1,  res=  168/  256
data=0c,  source_byte=9c,  samples_rem= 7991,  valpred= -23137,  vpdiff=33518,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7990,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7990,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7989,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7989,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7988,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7988,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7987,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7987,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7986,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7986,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7985,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7985,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7984,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7984,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=01,  source_byte=14,  samples_rem= 7983,  valpred= -10851,  vpdiff=12286,  sign=00,  delta=01,  index= 87,  step=29794,  adapt= -1,  res=   85/  256
data=04,  source_byte=14,  samples_rem= 7983,  valpred=  22667,  vpdiff=33518,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7982,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7982,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7981,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7981,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7980,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7980,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7979,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7979,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7978,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7978,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7977,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7977,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7976,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7976,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=09,  source_byte=9c,  samples_rem= 7975,  valpred=  10381,  vpdiff=12286,  sign=08,  delta=01,  index= 87,  step=29794,  adapt= -1,  res=  168/  256
data=0c,  source_byte=9c,  samples_rem= 7975,  valpred= -23137,  vpdiff=33518,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7974,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256

If I take only every second sample, it almost works acceptably for square waves, but problems occur for other waveforms. That's still not an acceptable solution, but perhaps it's a clue as to the cause of the issue.

If anyone has any ideas, I'd appreciate it. I've been tearing my hair out on this for a good few days.

Edit: The source for the audioop module can be found at https://github.com/python/cpython/blob/master/Modules/audioop.c; the ADPCM decoder is audioop_adpcm2lin_impl.

Tom Oldbury
  • Is the square wave so high in frequency that two adjacent samples really should have opposing signs that often? The "startup" (before it oscillates) flips between the signs, too. – unwind Mar 13 '18 at 11:13
  • @unwind The square wave is 500Hz, sampled at 8kHz, so I don't think so, unfortunately. The initial startup seems to be similar to the Python implementation, minus the oscillation. – Tom Oldbury Mar 13 '18 at 11:16
  • You could debug the original source and compare each value with your implementation up to the first difference. By the way, what size is `int` on your platform? – jeb Mar 13 '18 at 12:01
  • @jeb I am looking at doing that now. I believe int is 32 bits (on ARM). – Tom Oldbury Mar 13 '18 at 14:27
  • It could be an overflow problem. In Python, numbers are _unlimited_ (2 to the power of 100 is no problem there) – jeb Mar 13 '18 at 14:47
  • @jeb The Python implementation is in C, though; so the ints should be 64-bit. And none of the values get anywhere near that. – Tom Oldbury Mar 13 '18 at 14:51
  • Never make any assumptions about the size of `int`, just remove it from your code. Use `int32_t` instead. – Lundin Mar 13 '18 at 14:54
  • @lundin I did use int32_t for some tests, but it didn't work. Afraid that somehow, for some bizarre reason, int was different to int32_t, I changed it back, but no difference was observed. – Tom Oldbury Mar 13 '18 at 15:21
  • Is your input data correct - does it really alternate between 7 and f as at the start and 4 and c at the end? – DisappointedByUnaccountableMod Mar 14 '18 at 17:37
  • I'd extract the audioop c code and write a test harness around it, first make sure it works as you expect with your data, then instrument it and compare the values after each sample – DisappointedByUnaccountableMod Mar 14 '18 at 18:23
  • There's another C implementation here which looks very similar (but presumably works :-) http://ww1.microchip.com/downloads/en/AppNotes/00643b.pdf – DisappointedByUnaccountableMod Mar 14 '18 at 18:26
  • Using the `>>` operator on a signed integer (e.g. `vpdiff += step >> 2;`) is implementation-defined when the value is negative. This might be the source of your problem. – D Krueger Mar 14 '18 at 18:27
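The value-by-value comparison suggested in the comments above is easier if the decoder writes into a buffer instead of printing. The following is a minimal sketch of such a harness, assuming the standard IMA tables and the high-nibble-first packing used in the question. The function name `ima_adpcm_decode` is hypothetical, and the caller is assumed to provide room for two samples per input byte.

```c
#include <stddef.h>
#include <stdint.h>

static const int16_t adpcm_step_size[89] = {
        7,     8,     9,    10,    11,    12,    13,    14,    16,    17,
       19,    21,    23,    25,    28,    31,    34,    37,    41,    45,
       50,    55,    60,    66,    73,    80,    88,    97,   107,   118,
      130,   143,   157,   173,   190,   209,   230,   253,   279,   307,
      337,   371,   408,   449,   494,   544,   598,   658,   724,   796,
      876,   963,  1060,  1166,  1282,  1411,  1552,  1707,  1878,  2066,
     2272,  2499,  2749,  3024,  3327,  3660,  4026,  4428,  4871,  5358,
     5894,  6484,  7132,  7845,  8630,  9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

static const int8_t adpcm_step_size_adapt[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

/* Decode n_bytes of packed 4-bit codes (high nibble first) into
   2 * n_bytes samples. Returns the number of samples written. */
size_t ima_adpcm_decode(const uint8_t *in, size_t n_bytes, int16_t *out)
{
    int32_t valpred = 0;   /* predicted output value */
    int32_t index = 0;     /* current step index, 0..88 */
    size_t n_out = 0;

    for (size_t i = 0; i < n_bytes; i++) {
        for (int half = 0; half < 2; half++) {
            uint8_t code = half ? (uint8_t)(in[i] & 0x0f)
                                : (uint8_t)(in[i] >> 4);
            int32_t step = adpcm_step_size[index];

            /* Approximates (magnitude + 0.5) * step / 4 with shifts. */
            int32_t vpdiff = step >> 3;
            if (code & 4) vpdiff += step;
            if (code & 2) vpdiff += step >> 1;
            if (code & 1) vpdiff += step >> 2;

            /* Bit 3 is the sign of the difference. */
            valpred += (code & 8) ? -vpdiff : vpdiff;
            if (valpred > 32767) valpred = 32767;
            else if (valpred < -32768) valpred = -32768;

            /* Adapt the step index for the next code. */
            index += adpcm_step_size_adapt[code];
            if (index < 0) index = 0;
            else if (index > 88) index = 88;

            out[n_out++] = (int16_t)valpred;
        }
    }
    return n_out;
}
```

Feeding it the bytes `0xf7, 0xf7` from a zero initial state yields -11, 19, -44, 92. The last three match the first three valpred values in the debug output of the question, which suggests the decoder logic itself follows the data it is given.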

1 Answer


I managed to fix this issue. It was caused by a silly error: the 16-bit input data was being read one byte at a time. Decompressing the data in Python with the same error still produced a correct result, but this was obviously no good for the C implementation of the decoder.

In hindsight, I'm not sure why I didn't notice that the audio file was twice as big as it should have been.
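To illustrate the kind of mistake described above, here is a minimal sketch of assembling 16-bit samples from a raw byte stream before encoding. It assumes signed little-endian PCM, which the answer does not state, and the helper name `pcm16le_to_samples` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine pairs of bytes (low byte first) into
   signed 16-bit samples. A trailing odd byte is ignored.
   Returns the number of samples written. */
size_t pcm16le_to_samples(const uint8_t *bytes, size_t n_bytes, int16_t *out)
{
    size_t n_samples = n_bytes / 2;

    for (size_t i = 0; i < n_samples; i++) {
        int32_t v = (int32_t)bytes[2 * i] | ((int32_t)bytes[2 * i + 1] << 8);

        /* Sign-extend explicitly rather than relying on an
           implementation-defined narrowing conversion. */
        if (v >= 0x8000)
            v -= 0x10000;

        out[i] = (int16_t)v;
    }
    return n_samples;
}
```

If each byte is instead treated as its own sample, the stream alternates between low bytes and high bytes and contains twice as many "samples" as intended. That may explain both the alternating signs in the encoded data and the observation that taking only every second sample almost worked.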

Tom Oldbury