I have an implementation of a Inverse Discrete Cosine Transform and I'm trying to figure out how they got to this code. So far, I've figured out that this is probably an optimized implementation of the Cooley-Tukey radix-2 Decimation-in-time for a DCT instead of a DFT (Discrete Fourier Transform).
However, I'm still at a loss about exactly what happens at each stage. I figured that the Wx constants are probably the twiddle factors.
Can anybody provide a reference to an explanation, or provide some explanation to this code?
//Twiddle factors
#define W1 2841 /* 2048*sqrt(2)*cos(1*pi/16) */
#define W2 2676 /* 2048*sqrt(2)*cos(2*pi/16) */
#define W3 2408 /* 2048*sqrt(2)*cos(3*pi/16) */
#define W5 1609 /* 2048*sqrt(2)*cos(5*pi/16) */
#define W6 1108 /* 2048*sqrt(2)*cos(6*pi/16) */
#define W7 565 /* 2048*sqrt(2)*cos(7*pi/16) */
//Discrete Cosine Transform on a row of 8 DCT coefficients.
NJ_INLINE void njRowIDCT(int* blk) {
int x0, x1, x2, x3, x4, x5, x6, x7, x8;
int t;
if (!((x1 = blk[4] << 11)
| (x2 = blk[6])
| (x3 = blk[2])
| (x4 = blk[1])
| (x5 = blk[7])
| (x6 = blk[5])
| (x7 = blk[3])))
{
blk[0] = blk[1] = blk[2] = blk[3] = blk[4] = blk[5] = blk[6] = blk[7] = blk[0] << 3;
return;
}
x0 = (blk[0] << 11) + 128; //For rounding at fourth stage
//First stage
/*What exactly are we doing here? Do the x values have a meaning?*/
x8 = W7 * (x4 + x5);
x4 = x8 + (W1 - W7) * x4;
x5 = x8 - (W1 + W7) * x5;
x8 = W3 * (x6 + x7);
x6 = x8 - (W3 - W5) * x6;
x7 = x8 - (W3 + W5) * x7;
//Second stage
x8 = x0 + x1;
x0 -= x1;
x1 = W6 * (x3 + x2);
x2 = x1 - (W2 + W6) * x2;
x3 = x1 + (W2 - W6) * x3;
x1 = x4 + x6;
x4 -= x6;
x6 = x5 + x7;
x5 -= x7;
//Third stage
x7 = x8 + x3;
x8 -= x3;
x3 = x0 + x2;
x0 -= x2;
x2 = (181 * (x4 + x5) + 128) >> 8;
x4 = (181 * (x4 - x5) + 128) >> 8;
//Fourth stage
blk[0] = (x7 + x1) >> 8; //bit shift is to emulate 8 bit fixed point precision
blk[1] = (x3 + x2) >> 8;
blk[2] = (x0 + x4) >> 8;
blk[3] = (x8 + x6) >> 8;
blk[4] = (x8 - x6) >> 8;
blk[5] = (x0 - x4) >> 8;
blk[6] = (x3 - x2) >> 8;
blk[7] = (x7 - x1) >> 8;
}