I have a good intuition for how the 2-D IDCT that underlies the JPEG decoder works, especially after seeing the animation at the bottom of http://en.wikipedia.org/wiki/Discrete_cosine_transform.
I also understand that it can be formulated simply as:

$$F(x,y) = \sum_{u=0}^{7} \sum_{v=0}^{7} T(u,v)\, s(x,y,u,v)$$
However, I'm not sure I understand the intuition behind the forward DCT. It's also expressed as a weighted sum:

$$T(u,v) = \sum_{x=0}^{7} \sum_{y=0}^{7} F(x,y)\, r(x,y,u,v)$$
But for some reason $s(x,y,u,v) = r(x,y,u,v)$. Why is that?
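For reference, here is a minimal numerical sketch of that equality. It assumes the orthonormal 8×8 DCT-II kernel (the usual JPEG normalization); the names `alpha`, `kernel`, `forward` and `inverse` are mine, not from any library. It applies one and the same kernel in both directions and measures how far the round trip lands from the original block.

```python
import math
import random

N = 8  # JPEG block size

def alpha(k):
    # Normalization that makes the 1-D DCT-II basis orthonormal (assumed).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def kernel(x, y, u, v):
    # A single kernel, used here as both r (forward) and s (inverse).
    return (alpha(u) * alpha(v)
            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))

def forward(F):
    # T(u,v) = sum over x,y of F(x,y) * r(x,y,u,v)
    return [[sum(F[x][y] * kernel(x, y, u, v)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def inverse(T):
    # F(x,y) = sum over u,v of T(u,v) * s(x,y,u,v)
    return [[sum(T[u][v] * kernel(x, y, u, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

random.seed(0)
F = [[float(random.randint(0, 255)) for _ in range(N)] for _ in range(N)]
T = forward(F)
F_back = inverse(T)

# Largest absolute difference between the original and reconstructed block.
max_error = max(abs(F[x][y] - F_back[x][y])
                for x in range(N) for y in range(N))
print(max_error)
```

The round trip only works with a shared kernel because, under this normalization, the 64 basis images are mutually orthogonal and have unit length; with a different scaling convention the two kernels would differ by a constant factor.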
Also, the intuition I have for $s$ is this: every $T(u,v)$ corresponds to a little 8×8 image, where the one for $T(0,0)$ is smooth and the one for $T(7,7)$ is a checkerboard. The value of a pixel $F(3,7)$ is a linear combination of the values at $(3,7)$ in each one of those images, and $s(3,7,u,v)$ represents this value in each image.
So, for example, I can assume that $s(3,7,u,v)$ is positive for the little images in which $(3,7)$ is closer to white (255), and negative for those in which it is dark (0).
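To make this picture concrete, here is a small sketch that builds the little images and reads off the value at (3,7) in each of them. It again assumes the orthonormal 8×8 DCT-II kernel; `alpha`, `s` and `basis_image` are hypothetical helper names of my own. Note that under this assumption the basis images are signed and centered on zero rather than stored as 0..255, so "white" and "dark" here mean positive and negative.

```python
import math

N = 8  # JPEG block size

def alpha(k):
    # Orthonormal DCT-II normalization (assumed).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def s(x, y, u, v):
    # Value of basis image (u, v) at pixel (x, y).
    return (alpha(u) * alpha(v)
            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))

def basis_image(u, v):
    # The "little 8x8 image" that coefficient T(u, v) scales.
    return [[s(x, y, u, v) for y in range(N)] for x in range(N)]

# Weight of every basis image at pixel (3, 7).
weights = [[s(3, 7, u, v) for v in range(N)] for u in range(N)]
positive = [(u, v) for u in range(N) for v in range(N) if weights[u][v] > 0]
negative = [(u, v) for u in range(N) for v in range(N) if weights[u][v] < 0]

# The (0, 0) image is flat; the (7, 7) image flips sign between neighbours.
dc = basis_image(0, 0)
checker = basis_image(7, 7)
dc_is_flat = all(abs(value - dc[0][0]) < 1e-12 for row in dc for value in row)
checker_alternates = all((checker[x][y] > 0) == ((x + y) % 2 == 0)
                         for x in range(N) for y in range(N))

print(len(positive), len(negative), dc_is_flat, checker_alternates)
```

The `positive` and `negative` lists show which little images brighten pixel (3,7) and which darken it when their coefficient is positive.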
Is this a good intuition? Can you supply a similar, non-math intuition for $r$?
Thanks!