16

I have a buffered image in Java and I want to record how similar each pixel is to another, based on the color value, so that pixels with 'similar' colors get a higher similarity value. For example, red and pink would have a similarity value of 1000, but red and blue would have something like 300 or less.

How can I do this? When I get the RGB of a pixel from a BufferedImage, it returns a negative integer, and I am not sure how to work with that.

Cœur
anon

12 Answers

24

First, how are you getting the integer value?

Once you get the RGB values, you could try

((r2 - r1)² + (g2 - g1)² + (b2 - b1)²)^(1/2)

This would give you the distance in 3D space between the two points (r1,g1,b1) and (r2,g2,b2).
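For example, a minimal sketch in Java, assuming you already have the two colors as packed ARGB ints like those returned by BufferedImage.getRGB(x, y) (the method name is just illustrative):

// Sketch only: Euclidean distance between two packed ARGB ints.
static double rgbDistance(int argb1, int argb2) {
    int r1 = (argb1 >> 16) & 0xFF, g1 = (argb1 >> 8) & 0xFF, b1 = argb1 & 0xFF;
    int r2 = (argb2 >> 16) & 0xFF, g2 = (argb2 >> 8) & 0xFF, b2 = argb2 & 0xFF;
    int dr = r2 - r1, dg = g2 - g1, db = b2 - b1;
    return Math.sqrt(dr * dr + dg * dg + db * db); // smaller = more similar
}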

Or there are more sophisticated ways using the HSV value of the color.

lavinio
  • I think you've left out the ^2 after (b2-b1); anyway, +1 because I was about to post the same – Erich Kitzmueller Nov 12 '09 at 21:28
  • Is that really how you measure distance in 3d space? I would have thought it would involve a square root somewhere? if this works, then it could be used in 2d space, and you've just outsmarted pythagoras. – Breton Nov 12 '09 at 21:51
  • 8
    He's taking the cube root, which is incorrect; it should be the square root. But taking the root is unnecessary, since you can compare squares of distances as easily as the distances themselves, and save the time of taking the root. – David R Tribble Nov 12 '09 at 22:08
  • This can be used to solve the issue of the OSX color matching discrepancies of the Robot.getPixelColor(int x, int y). – Charles Mosndup Mar 04 '19 at 08:36
11

HSL is a bad move. L*a*b is a color space designed to represent how color is actually perceived, and it is based on data from hundreds of experiments in which people with real eyes looked at different colors and said "I can tell the difference between those two. But not those two."

Distance in L*a*b space represents actual perceived distance according to the predictions derived from those experiments.

Once you convert into L*a*b you just need to measure linear distance in a 3D space.
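For a rough idea, here is a minimal Java sketch, assuming sRGB input (as returned by BufferedImage.getRGB) and the D65 reference white; the method names rgbToLab and deltaE are just illustrative:

// Sketch only: packed sRGB -> L*a*b* (D65), then Euclidean distance (ΔE*ab).
static double[] rgbToLab(int rgb) {
    int[] c = { (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF };
    double[] lin = new double[3];
    for (int i = 0; i < 3; i++) {
        double v = c[i] / 255.0; // linearize the sRGB channel
        lin[i] = v <= 0.04045 ? v / 12.92 : Math.pow((v + 0.055) / 1.055, 2.4);
    }
    // Linear RGB -> XYZ using the standard sRGB/D65 matrix.
    double x = 0.4124 * lin[0] + 0.3576 * lin[1] + 0.1805 * lin[2];
    double y = 0.2126 * lin[0] + 0.7152 * lin[1] + 0.0722 * lin[2];
    double z = 0.0193 * lin[0] + 0.1192 * lin[1] + 0.9505 * lin[2];
    // XYZ -> L*a*b*, normalized by the D65 reference white.
    double fx = f(x / 0.95047), fy = f(y / 1.0), fz = f(z / 1.08883);
    return new double[] { 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz) };
}

static double f(double t) {
    return t > 216.0 / 24389.0 ? Math.cbrt(t) : (24389.0 / 27.0 * t + 16) / 116;
}

static double deltaE(double[] lab1, double[] lab2) {
    double dL = lab1[0] - lab2[0], da = lab1[1] - lab2[1], db = lab1[2] - lab2[2];
    return Math.sqrt(dL * dL + da * da + db * db); // linear distance in L*a*b space
}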

Peter O.
Breton
9

I suggest you start reading here if you want to do this right:

Color difference formulas. It explains the ΔE*ab, ΔE*94, ΔE*00 and ΔE*CMC formulas for calculating color difference.
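For reference, the oldest and simplest of these, ΔE*ab (CIE76), is just the Euclidean distance between the two colors in L*a*b* space (in LaTeX notation):

\Delta E^{*}_{ab} = \sqrt{(L^{*}_2 - L^{*}_1)^2 + (a^{*}_2 - a^{*}_1)^2 + (b^{*}_2 - b^{*}_1)^2}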

jitter
6

If you are going to use HSV, you need to realize that HSV values are not points in a three-dimensional space but rather the angle, magnitude, and distance-from-top of a cone. To calculate the distance between two HSV values, you first need to convert them to points in 3D space by transforming:

X = Cos(H)*S*V

Y = Sin(H)*S*V

Z = V

For both points, and then taking the Euclidean distance between them:

Sqrt((X0 - X1)*(X0 - X1) + (Y0 - Y1)*(Y0 - Y1) + (Z0 - Z1)*(Z0 - Z1))

At a cost of 2 Cos, 2 Sin, and a square root.

Alternatively, you can calculate the distance a bit more easily, if you're so inclined, by realizing that when flattened to 2D space you simply have two vectors from the origin, and applying the law of cosines to find the distance in the XY plane:

C² = A² + B² - 2*A*B*Cos(Theta)

Where A = S*V of the first value, B = S*V of the second, and Theta is the hue difference H0 - H1.

Then you factor in Z, to expand the 2D space into 3D space.

A = S0*V0
B = S1*V1
dTheta = H1-H0
dZ = V0-V1
distance = sqrt(dZ*dZ + A*A + B*B - 2*A*B*Cos(dTheta))

Note that because the law of cosines gives us C², we just plug it right in there alongside the change in Z. This costs 1 Cos and 1 Sqrt. HSV is plenty useful; you just need to know what kind of color space it describes. You can't just plug the values into a Euclidean distance function and get something coherent out of it.
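A minimal Java sketch of that cone distance (the method name is illustrative), assuming h is in degrees and s, v are in [0, 1]:

// Sketch: distance between two HSV values treated as points in a cone.
static double hsvConeDistance(double h0, double s0, double v0,
                              double h1, double s1, double v1) {
    double a = s0 * v0;                     // radius of the first point
    double b = s1 * v1;                     // radius of the second point
    double dTheta = Math.toRadians(h1 - h0);
    double dz = v0 - v1;
    // The law of cosines gives the squared XY distance directly.
    return Math.sqrt(dz * dz + a * a + b * b - 2 * a * b * Math.cos(dTheta));
}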

Tatarize
2

The easiest approach is to convert both colours to HSV and find the difference in their H values. A small difference means the colours are similar. It's up to you to define a threshold, though.
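For instance, a small sketch using java.awt.Color.RGBtoHSB, assuming r1,g1,b1 and r2,g2,b2 are channel values you have already extracted (the method name is just illustrative):

// Sketch: hue-only difference. RGBtoHSB returns hue in [0, 1),
// so the difference has to wrap around.
static float hueDifference(int r1, int g1, int b1, int r2, int g2, int b2) {
    float h1 = java.awt.Color.RGBtoHSB(r1, g1, b1, null)[0];
    float h2 = java.awt.Color.RGBtoHSB(r2, g2, b2, null)[0];
    float dh = Math.abs(h1 - h2);
    return Math.min(dh, 1f - dh); // 0 = same hue, 0.5 = opposite hue
}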

EmFi
2

You're probably calling getRGB() on each pixel, which returns the color as four 8-bit channels packed into one int: the high byte is alpha, then red, then green, then blue. You need to separate out the channels. Even then, color similarity in RGB space is not so great - you might get much better results using HSL or HSV space. See here for conversion code.

In other words:

int a = (argb >> 24) & 0xff;
int r = (argb >> 16) & 0xff;
int g = (argb >> 8) & 0xff;
int b = argb & 0xff;

I don't know the specific byte ordering in java buffered images, but I think that's right.

plinth
  • 1
    -1 for reccomending HSL or HSV. Those are ad hoc transformations that don't really mean anything in the real world. they were invented for early graphics programs. distance in L*a*b space is based on measurements from hundreds of experiments with real eyes and real colors. – Breton Nov 12 '09 at 21:45
2

You could get the separate bytes as follows:

int rgb = bufferedImage.getRGB(x, y); // Returns by default ARGB.
int alpha = (rgb >>> 24) & 0xFF;
int red = (rgb >>> 16) & 0xFF;
int green = (rgb >>> 8) & 0xFF;
int blue = (rgb >>> 0) & 0xFF;
BalusC
1

I find HSL values easier to understand. HSL Color explains how they work and provides the conversion routines. Like the other answers, you would need to determine what 'similar' means to you.

camickr
1

There's an interesting paper on exactly this problem:

A New Perceptually Uniform Color Space with Associated Color Similarity Measure for Content-Based Image and Video Retrieval by M. Sarifuddin and Rokia Missaoui

You can find this easily using Google or, in particular, Google Scholar.

To summarise, some color spaces (e.g. RGB, HSV, Lab) and distance measures (such as the geometric mean and Euclidean distance) are better representations of human perception of color similarity than others. The paper talks about a new color space, which is better than the rest, but it also provides a good comparison of the common existing color spaces and distance measures. Qualitatively*, it seems the best measure for perceptual distance using commonly available color spaces is the HSV color space with a cylindrical distance measure.

*At least, according to Figure 15 in the referenced paper.

The cylindrical distance measure is (in LaTeX notation):

D_{cyl} = \sqrt{\Delta V^{2} + S_1^{2} + S_2^{2} - 2 S_1 S_2 \cos(\Delta H)}
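A direct transcription of that formula into Java might look like this (a sketch only; it assumes H is in degrees and S, V are in [0, 1], and the method name is illustrative):

// Sketch: cylindrical distance measure as written above.
static double cylindricalDistance(double h1, double s1, double v1,
                                  double h2, double s2, double v2) {
    double dV = v1 - v2;
    double dH = Math.toRadians(h1 - h2);
    return Math.sqrt(dV * dV + s1 * s1 + s2 * s2 - 2 * s1 * s2 * Math.cos(dH));
}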

Tim Gee
  • 2
    Having read the paper and implemented the algorithm. It's basically crap. It's not horrible but it's really not better than HSV. The derivation of the C and L variables are fine, the paper lists hue incorrectly (if you used the values as given green=red). The closeness of figure 15 is due to a flawed methodology. It rejected colors until it had enough to fill that area. So any colorspace is being graded on the threshold of rejection it has been given and *NOT* it's ability to properly reflect what the average human would gauge as the most different colors. And this isn't even all the mistakes. – Tatarize Sep 03 '12 at 06:48
0

This is a similar question to #1634206.

If you're looking for the distance in RGB space, the Euclidean distance will work, assuming you treat red, green, and blue values all equally.

If you want to weight them differently, as is commonly done when converting color/RGB to grayscale, you need to weight each component by a different amount. For example, using the popular conversion from RGB to grayscale of 30% red + 59% green + 11% blue:

d2 = (30*(r1-r2))**2 + (59*(g1-g2))**2 + (11*(b1-b2))**2;

The smaller the value of d2, the closer the colors (r1,g1,b1) and (r2,g2,b2) are to each other.
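In Java, a minimal sketch of that weighted comparison (the method name is illustrative; the weights are the 30/59/11 values from the formula above):

// Sketch: weighted squared RGB distance; smaller means more similar.
static long weightedRgbDistanceSquared(int r1, int g1, int b1,
                                       int r2, int g2, int b2) {
    long dr = 30L * (r1 - r2);
    long dg = 59L * (g1 - g2);
    long db = 11L * (b1 - b2);
    return dr * dr + dg * dg + db * db;
}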

But there are other color spaces to choose from than just RGB, which may be better suited to your problem.

David R Tribble
  • The weights there are gamma correction weights. While the blues (11) do not make a large contribution to human perception of brightness they do actually make a large discrimination in human determination of color distance. http://www.compuphase.com/cmetric.htm The weights 2,4,3 are better for the stated purpose. I ran LAB (delta E) through every permutation of color distance (2^24 * 2^24) and averaged them. And the weights 22, 43, 35 are better (using LAB as a surrogate for human eyes). Also, I've implemented that weighted Euclidean algorithm, it sucks! – Tatarize Sep 03 '12 at 06:55
0

Color perception is not linear because the human eye is more sensitive to certain colors than others.

So jitter answered correctly.

Pylyp
-2

I tried it out. The HSL/HSV value is definitely not useful. For instance:

  • all colors with L=0 are 'black' (RGB 000000), though their HSL difference may imply a high color distance.

  • all colors with S=0 are a shade of 'gray', though their HSL difference may imply a high color distance.

  • the H (hue) range begins and ends with a shade of 'red', so H=0 and H=[max] (360°, 100%, or 240, depending on the application) are both red and relatively similar to each other, but the Euclidean HSL distance is close to maximum.

So my recommendation is to use the squared Euclidean RGB distance (r2-r1)² + (g2-g1)² + (b2-b1)² without the root. A (subjective) threshold of 1000 then works fine for similar colors; colors with differences > 1000 are easily distinguishable by the human eye. Additionally, it can be helpful to weight the components differently (see the previous post).
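For instance, a minimal sketch of that threshold test, assuming packed (A)RGB ints such as those returned by getRGB (the method name is illustrative):

// Sketch: squared Euclidean RGB distance compared against the
// (subjective) threshold of 1000 suggested above; no square root needed.
static boolean isSimilar(int rgb1, int rgb2) {
    int dr = ((rgb1 >> 16) & 0xFF) - ((rgb2 >> 16) & 0xFF);
    int dg = ((rgb1 >> 8) & 0xFF) - ((rgb2 >> 8) & 0xFF);
    int db = (rgb1 & 0xFF) - (rgb2 & 0xFF);
    return dr * dr + dg * dg + db * db < 1000;
}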

  • Hue is an angle. You can't just toss it into Euclidean distance equation. You need to plot it as a position in space within some kind of shape (cylinder, cone), and then find the distances between those positions within that shape. As you're trying to do it you would find HSV(360°, 1, 1) and HSV(0°, 1, 1) to be HUGELY different, when really they are the exact same color (solid red). – Tatarize Sep 03 '12 at 07:04