
I have a background in mathematics and machine learning, but I'm quite new to image compression. The other day I was thinking about the optimal way to compress an image using just a lookup table. That is: given an original image with N unique values, replace it with a new image with M unique values, where M < N. For a fixed M, my question is how to pick those values. I realized that if we take the total squared error (MSE) over all pixels as the figure of merit, all the information we need is in the histogram of the pixel intensities. Intuitively, the most common values should be mapped to a closer value than the uncommon ones, making the high regions of the histogram more "dense" in new values than the low regions. Hence I was wondering whether there exists a mathematical formula that:

- Given the histogram h(x) of all the pixel intensities,

- Given the number M of new unique values,

defines the set of M new values {X_new} that minimizes the total error, i.e. that minimizes L({X_new}) = sum_x h(x) * min_m (x - X_m)^2. I tried to write down this loss function and take the derivative, but some argmin (nearest-value assignment) operations appear that I don't know how to differentiate. Still, my intuition tells me that a closed-form solution should exist.

Example: say we have an image with just 10 pixels, with values {1,1,1,1,2,2,2,2,3,3}. Initially N = 3, and we are asked to select the M = 2 unique values that minimize the error. It is clear that we have to pick the two most common ones, so {X_new} = {1,2}, and the new image will be "compressed" as {1,1,1,1,2,2,2,2,2,2}. If we are asked to pick M = 1, we pick {X_new} = {2} to minimize the error.
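For concreteness, here is a minimal Python sketch of the loss I am trying to minimize (the function and variable names are my own), checked against the toy example above:

```python
def total_mse(hist, new_values):
    """Total squared error when every intensity x (occurring hist[x] times)
    is mapped to the nearest value in new_values."""
    return sum(count * min((x - v) ** 2 for v in new_values)
               for x, count in hist.items())

hist = {1: 4, 2: 4, 3: 2}            # the 10-pixel example above
print(total_mse(hist, [1, 2]))       # 2 -> best palette for M = 2
print(total_mse(hist, [1, 3]))       # 4
print(total_mse(hist, [2]))          # 6 -> best single original value
```

(Note that if the new values need not be original intensities, the best single value for MSE is the histogram mean, 1.8 here, with total error 5.6.)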

Thanks!

1 Answer


This is called color quantization or palettization. It is essentially a clustering problem, usually in the 3-D RGB space. Each cluster becomes a single color in the quantized image. The GIF and PNG image formats both support palettes.

There are many clustering algorithms out there, with a lot of research behind them. For this, I would first try k-means and DBSCAN.
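For example, here is a minimal k-means palettization sketch using scikit-learn (the function name, image array, and color count are placeholders, not a fixed API):

```python
import numpy as np
from sklearn.cluster import KMeans

def palettize(image, n_colors=16):
    """Cluster the pixels in RGB space and replace each pixel
    with its cluster center (the corresponding palette entry)."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)
    km = KMeans(n_clusters=n_colors, n_init=10, random_state=0).fit(pixels)
    palette = km.cluster_centers_               # the M new colors
    quantized = palette[km.labels_].reshape(h, w, c)
    return np.rint(quantized).astype(image.dtype), palette

# Usage (hypothetical image array):
# img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
# quantized, palette = palettize(img, n_colors=8)
```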

Note that palettization would only be one part of an effective image compression approach. You would also want to take advantage of the spatial correlation of neighboring pixels (often done with a 2-D spatial-frequency analysis such as a discrete cosine transform or wavelet transform), as well as the lower resolution of the human eye in color discrimination compared to grayscale acuity.
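As a quick illustration of the spatial-frequency idea, a 2-D DCT of a smooth block concentrates most of the energy in a few low-frequency coefficients, which is what makes it compressible; a sketch with SciPy (the toy block is made up):

```python
import numpy as np
from scipy.fft import dctn

block = np.arange(64, dtype=np.float64).reshape(8, 8)  # a smooth toy 8x8 block
coeffs = dctn(block, norm='ortho')   # 2-D DCT-II, the transform used in JPEG
# Most of the energy sits in the top-left (low-frequency) corner:
print(np.round(coeffs[:2, :2], 1))
```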

Unless you want to embark on a few years of research to improve the state of the art, I recommend that you use existing image compression algorithms and formats.

Mark Adler
  • Thanks a lot, your answer is really helpful. Agreed that I should use existing approaches; I was just wondering whether there are any closed-form solutions, since my intuition was telling me there should be. Given that I need to use a clustering algorithm like k-means or DBSCAN, it seems there is no closed-form optimal solution, since those algorithms typically only guarantee local minima, not global ones. Am I right? – markk borras Aug 09 '21 at 18:08