Why does JPEG compression process the image in 8x8 blocks instead of applying the Discrete Cosine Transform to the whole image?
-
Consider that the DCT is good at compressing rather smooth areas with low-frequency content, but quite bad at compressing areas with high-frequency content. – Dan D. May 28 '12 at 06:57
-
identical question: http://stackoverflow.com/questions/11147666/why-do-image-compression-algorithms-process-the-image-by-sub-blocks/11147696#11147696 – ctrl-alt-delor Jun 22 '12 at 09:35
-
Earlier identical question: http://stackoverflow.com/questions/74892/is-there-a-quality-file-size-or-other-benefit-to-jpeg-sizes-being-multiples-of – David Cary Jun 23 '12 at 04:38
-
Does this answer your question? [Why do image compression algorithms process the image by sub-blocks?](https://stackoverflow.com/questions/11147666/why-do-image-compression-algorithms-process-the-image-by-sub-blocks) – Björn Lindqvist Sep 01 '22 at 16:07
3 Answers
8x8 was chosen after numerous experiments with other sizes.
The conclusions of those experiments were:
1. Matrices larger than 8x8 are harder to handle mathematically (transforms etc.), are poorly supported by hardware, or take longer to process.
2. Matrices smaller than 8x8 don't carry enough information through the rest of the pipeline, which results in poor quality in the compressed image.
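To make this concrete, here is a minimal sketch (assuming NumPy and SciPy are available; this is not the JPEG reference implementation) of what happens to a single 8x8 block in the pipeline: level shift, 2-D DCT, then quantization. The quantization table is the example luminance table from the JPEG spec; everything else is illustrative.

```python
import numpy as np
from scipy.fftpack import dct

# Example luminance quantization table from the JPEG specification (Annex K).
Q_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def dct2(block):
    # Separable 2-D DCT-II: transform the columns, then the rows.
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def encode_block(block_8x8):
    shifted = block_8x8.astype(np.float64) - 128.0     # level shift
    coeffs = dct2(shifted)                             # 64 frequency coefficients
    return np.round(coeffs / Q_LUMA).astype(np.int32)  # most quantize to zero

# A fairly smooth block quantizes to mostly zeros (illustrative input).
block = np.full((8, 8), 120) + np.arange(8)
print(encode_block(block))
```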

-
Then it stands to reason that in an updated spec, 16x16 blocks or 32x32 blocks would be more efficient. – Björn Lindqvist Sep 01 '22 at 16:09
Because that would take "forever" to decode. I don't remember fully now, but I think you need at least as many coefficients as there are pixels in the block. If you code the whole image as a single block, I think you need to iterate through all the DCT coefficients for every pixel.
I'm not very good at big O calculations, but I guess the complexity would be O("forever"). ;-)
For modern video codecs I think they've started using 16x16 blocks instead.
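As a back-of-the-envelope illustration of this reasoning (a naive DCT in which every output coefficient sums over every input sample; the numbers are illustrative, not measurements):

```python
# Multiply-add count for a naive DCT, per this answer's reasoning.
N = 1920 * 1080                # pixels in a Full HD frame

ops_8x8_blocks = N * 8 * 8     # each coefficient sums over the 64 pixels of its block
ops_whole_image = N * N        # each coefficient sums over every pixel of the image

print(f"8x8 blocks : {ops_8x8_blocks:.2e} multiply-adds")   # ~1.3e8
print(f"whole image: {ops_whole_image:.2e} multiply-adds")  # ~4.3e12
```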

-
If you need to iterate on everything every iteration, it's O(n^2), not "forever", which is O(n!). – Triang3l Dec 19 '12 at 10:45
-
I'm not sure about "forever" - I have a program called "FourierPainter", and it performs the Fourier transform and the inverse back to an image in a split second. The whole image, mind you. And the Fourier transform is not much different from the DCT; it even involves slightly more calculation (it keeps imaginary values as phases, whereas the DCT doesn't). You can download Fourier Painter yourself and check: http://www.jcrystal.com/products/fp/ – shal May 02 '18 at 12:36
-
Not really. It takes O(N log N) to do a full DCT, and O(N log M) with a fixed block size M < N. – paperclip optimizer Oct 01 '22 at 17:42
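For concreteness, a rough sketch of the arithmetic behind this comment, assuming a fast (FFT-style) DCT costing about n·log2(n) operations per length-n transform; the figures are illustrative only:

```python
from math import log2

N = 1920 * 1080      # samples in the whole image
M = 8 * 8            # samples in one 8x8 block

full_image = N * log2(N)              # one transform over the entire image
blocked    = (N / M) * (M * log2(M))  # N/M blocks of cost M*log2(M) each -> N*log2(M)

print(f"full image: {full_image:.2e} ops, 8x8 blocks: {blocked:.2e} ops")  # ~4.4e7 vs ~1.2e7
```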
One good reason is that images (or at least the kind of images humans like to look at) have a high degree of information correlation locally, but not globally.
Every relatively smooth patch of skin, or piece of sky or grass or wall eventually ends in a sharp edge and is replaced by something entirely different. This means you still need a high frequency cutoff in order to represent the image adequately rather than just blur it out.
Now, because Fourier-like transforms like the DCT "jumble" all the spatial information, you wouldn't be able to throw away any intermediate coefficients either, nor the high-frequency components "you don't like".
There are of course other ways to try to discard visual noise and reconstruct edges at the same time by preserving high frequency components only when needed, or do some iterative reconstruction of the image at finer levels of detail. You might want to look into scale-space representation and wavelet transforms.
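A small sketch of that point (assuming NumPy and SciPy; the test blocks and the 4x4 cutoff are invented for illustration): keeping only the low-frequency corner of an 8x8 DCT barely changes a smooth gradient, but visibly distorts a block containing a sharp edge.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(b):  return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
def idct2(b): return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def error_after_truncation(block, keep=4):
    """Keep only the low-frequency keep x keep corner of the DCT,
    reconstruct, and report the maximum pixel error."""
    coeffs = dct2(block)
    kept = np.zeros_like(coeffs)
    kept[:keep, :keep] = coeffs[:keep, :keep]
    return float(np.max(np.abs(idct2(kept) - block)))

yy, xx = np.mgrid[0:8, 0:8]
smooth = 100.0 + 2.0 * xx + 1.0 * yy       # gentle gradient
edge = np.where(xx + yy < 8, 0.0, 255.0)   # hard diagonal edge

print(error_after_truncation(smooth))  # small: a few grey levels
print(error_after_truncation(edge))    # large: heavy ringing around the edge
```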
