7

So I figured out I can convert an image to grayscale like this:

public static Bitmap GrayScale(this Image img)
{
    var bmp = new Bitmap(img.Width, img.Height);
    using(var g = Graphics.FromImage(bmp))
    {
        // Classic luminance weights (0.30 R + 0.59 G + 0.11 B),
        // written to each of the R, G and B output channels:
        var colorMatrix = new ColorMatrix(
            new[]
                {
                    new[] {.30f, .30f, .30f, 0, 0},
                    new[] {.59f, .59f, .59f, 0, 0},
                    new[] {.11f, .11f, .11f, 0, 0},
                    new[] {0, 0, 0, 1.0f, 0},
                    new[] {0, 0, 0, 0, 1.0f}
                });

        using(var attrs = new ImageAttributes())
        {
            attrs.SetColorMatrix(colorMatrix);
            g.DrawImage(img, new Rectangle(0, 0, img.Width, img.Height),
                0, 0, img.Width, img.Height, GraphicsUnit.Pixel, attrs);
        }
    }
    return bmp;
}

Now, I want to compute the average "direction" of the pixels.

What I mean by that is that I want to look at, say, a 3x3 region: if the left side is darker than the right side, the direction would be to the right; if the bottom is darker than the top, the direction would be upwards; if the bottom-left is darker than the top-right, the direction would be up-right. (Think of little vector arrows over every 3x3 region.) Perhaps a better example: if you draw a grayscale gradient in Photoshop, I want to compute at what angle it was drawn.

I've done stuff like this in MatLab, but that was years ago. I figure I could use a matrix similar to ColorMatrix to compute this, but I'm not quite sure how. It looks like this function might be what I want; could I convert the image to grayscale (as above) and then do something with the grayscale matrix to compute these directions?

IIRC, what I want is quite similar to edge detection.

After I compute these direction vectors, I'm just going to loop over them and compute the average direction of the image.

The end goal is that I want to rotate images so that their average direction is always upwards; this way, if I have two identical images except one is rotated (90, 180, or 270 degrees), they will end up oriented the same way (I'm not concerned if a person ends up upside down).
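To make the idea concrete, here's a rough sketch of the per-pixel "direction" I mean, in Python rather than C# just to prototype the math. It uses plain central differences over each pixel's 3x3 neighborhood (one of several possible gradient operators, not necessarily the one I'll end up using):

```python
import math

# Hypothetical prototype (not my actual C#): central differences over
# each pixel's 3x3 neighborhood give a vector pointing from dark
# toward bright.

def gradient_directions(img):
    """Return (angle, magnitude) for each interior pixel of a
    grayscale image given as a list of rows of intensities."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0  # brighter right -> gx > 0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0  # brighter below -> gy > 0
            out.append((math.atan2(gy, gx), math.hypot(gx, gy)))
    return out

# A ramp that is dark on the left and bright on the right: every
# interior pixel should point "to the right" (angle 0).
ramp = [[10 * x for x in range(5)] for _ in range(5)]
directions = gradient_directions(ramp)
```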


*snip* Deleting some spam. You can view the revisions if you want to read the rest of my attempts.

mpen
  • I think if you compute the correlation of each image tile vs the coordinates (e.g. `meshgrid`) you should get your "direction". Also I suspect that the average of tile directions is going to give the same answer as if you correlated the entire image as a single tile. If you still have MatLab, I would use it to test your algorithm, then port the final version to C#. – Ben Voigt Apr 15 '12 at 23:53
  • @BenVoigt: What's a meshgrid? I'd happily compute the direction on the image as a whole if I knew how. I don't have a copy of MatLab anymore... I only used it for a couple semesters in uni; I'm not sure I'd remember how to use it. Requires a different way of thinking. – mpen Apr 16 '12 at 00:12
  • Sounds like all you want is the gradient. Have you tried that yet? – dranxo Apr 16 '12 at 00:42
  • @Mark: `meshgrid` is a MatLab function. You won't need it if calculating correlation in C#. Just use the coordinate as one variable and the image intensity as the other. Actually, you may just be calculating a centroid in two dimensions. – Ben Voigt Apr 16 '12 at 00:44
  • @rcompton: Tried what exactly? – mpen Apr 16 '12 at 00:56
  • @Mark: Using the gradient of your image to compute the direction you want. You are describing this: http://en.wikipedia.org/wiki/Image_gradient in your post, right? – dranxo Apr 16 '12 at 01:52
  • @rcompton: I didn't know it was actually called the "gradient". Yes, that's exactly what I want. I was hoping to get some answers telling me which C# functions would be useful in computing this, but it looks like EmguCV will do the trick. I've managed to apply the Sobel operator,...bit more work, and I should have it. – mpen Apr 16 '12 at 05:17
  • If you are trying to find the perspective orientation of an image, this is going to be tough, though. –  Apr 16 '12 at 07:39
  • @Mark you will definitely need a 'reference' image to compare and decide if the image is rotated or not. –  Apr 16 '12 at 07:42
  • @Wajih: Hrm? The goal isn't to determine if the image is rotated, it's just to align them in the same direction. I don't care if they come out sideways, as long as they *both* come out sideways. – mpen Apr 16 '12 at 15:55

4 Answers

9

Calculating the mean of angles is generally a bad idea:

...
        sum += Math.Atan2(yi, xi);
    }
}
double avg = sum / (img.Width * img.Height);

The mean of a set of angles has no clear meaning: for example, the mean of one angle pointing up and one angle pointing down is an angle pointing right. Is that what you want? Assuming "up" is +PI, the mean of two angles almost pointing up would be an angle pointing down, if one angle is PI-[some small value] and the other -PI+[some small value]. That's probably not what you want. Also, you're completely ignoring the strength of the edge - most of the pixels in your real-life images aren't edges at all, so the gradient direction there is mostly noise.

If you want to calculate something like an "average direction", you need to add up vectors instead of angles, then calculate Atan2 after the loop. Problem is: That vector sum tells you nothing about objects inside the image, as gradients pointing in opposite directions cancel each other out. It only tells you something about the difference in brightness between the first/last row and first/last column of the image. That's probably not what you want.
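A toy illustration of the difference (Python, just to show the arithmetic): two angles that both point almost "up" (taking up = +PI), averaged naively versus summed as vectors.

```python
import math

# Two angles just shy of "up" (up = +PI), on opposite sides of the
# +/-PI wraparound.
a1 = math.pi - 0.01    # almost up, approached from below +PI
a2 = -math.pi + 0.01   # the same direction, wrapped around

naive_mean = (a1 + a2) / 2.0      # 0.0 -> points the wrong way entirely

# Sum unit vectors, then take Atan2 afterwards:
vx = math.cos(a1) + math.cos(a2)
vy = math.sin(a1) + math.sin(a2)
vector_mean = math.atan2(vy, vx)  # ~ +/-PI -> points "up", as expected
```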

I think the simplest way to orient images is to create an angle histogram: Create an array with (e.g.) 360 bins for 360° of gradient directions. Then calculate the gradient angle and magnitude for each pixel. Add each gradient magnitude to the right angle-bin. This won't give you a single angle, but an angle-histogram, which can then be used to orient two images to each other using simple cyclic correlation.
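A minimal sketch of that histogram construction (Python for brevity; assumed details not taken from the rest of this answer: 1-degree bins and plain central differences instead of Gaussian derivatives):

```python
import math

# Sketch of the angle histogram: one bin per degree, each pixel's
# gradient magnitude added to the bin for its gradient direction.

def angle_histogram(img, bins=360):
    """Magnitude-weighted histogram of gradient directions for a
    grayscale image given as a list of rows of intensities."""
    hist = [0.0] * bins
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag > 0.0:
                deg = int(round(math.degrees(math.atan2(gy, gx)))) % bins
                hist[deg] += mag  # strong edges dominate, noise pixels barely count
    return hist
```

Rotating the image by 90° cyclically shifts this histogram by 90 bins, which is exactly what makes the correlation step possible.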

Here's a proof-of-concept Mathematica implementation I've thrown together to see if this would work:

angleHistogram[src_] :=
 (
  Lx = GaussianFilter[ImageData[src], 2, {0, 1}];
  Ly = GaussianFilter[ImageData[src], 2, {1, 0}];
  angleAndOrientation = 
   MapThread[{Round[ArcTan[#1, #2]*180/\[Pi]], 
      Sqrt[#1^2 + #2^2]} &, {Lx, Ly}, 2];
  angleAndOrientationFlat = Flatten[angleAndOrientation, 1];
  bins = BinLists[angleAndOrientationFlat, 1, 5];
  histogram = 
   Total /@ Flatten[bins[[All, All, All, 2]], {{1}, {2, 3}}];
  maxIndex = Position[histogram, Max[histogram]][[1, 1]];
  Labeled[
   Show[
    ListLinePlot[histogram, PlotRange -> All],
    Graphics[{Red, Point[{maxIndex, histogram[[maxIndex]]}]}]
    ], "Maximum at " <> ToString[maxIndex] <> "\[Degree]"]
  )

Results with sample images:

(image: sample images with their angle histograms, each labeled with its maximum angle)

The angle histograms also show why the mean angle can't work: the histogram is essentially a single sharp peak, and the other angles are roughly uniform. The mean of this histogram will always be dominated by the uniform "background noise". That's why you've got almost the same angle (about 180°) for each of the "real-life" images with your current algorithm.

The tree image has a single dominant angle (the horizon), so in this case, you could use the mode of the histogram (the most frequent angle). But that will not work for every image:

(image: an image whose angle histogram shows two distinct peaks)

Here you have two peaks. Cyclic correlation should still orient two images to each other, but simply using the mode is probably not enough.

Also note that the peak in the angle histogram is not "up": In the tree image above, the peak in the angle histogram is probably the horizon. So it's pointing up. In the Lena image, it's the vertical white bar in the background - so it's pointing to the right. Simply orienting the images using the most frequent angle will not turn every image with the right side pointing up.

(image: an image whose angle histogram shows several peaks)

This image has even more peaks: Using the mode (or, probably, any single angle) would be unreliable to orient this image. But the angle histogram as a whole should still give you a reliable orientation.

Note: I didn't pre-process the images, I didn't try gradient operators at different scales, I didn't post-process the resulting histogram. In a real-world application, you would tweak all these things to get the best possible algorithm for a large set of test images. This is just a quick test to see if the idea could work at all.

Add: To orient two images using this histogram, you would

  1. Normalize all histograms, so the area under the histogram is the same for each image (even if some are brighter, darker or blurrier)
  2. Take the histograms of the images, and compare them for each rotation you're interested in:

For example, in C#:

int bestDifferenceSoFar = int.MaxValue;
int foundRotation = 0;
for (int rotationAngle = 0; rotationAngle < 360; rotationAngle++)
{
   int difference = 0;
   for (int i = 0; i < 360; i++)
      difference += Math.Abs(histogram1[i] - histogram2[(i+rotationAngle) % 360]);
   if (difference < bestDifferenceSoFar)
   {
      bestDifferenceSoFar = difference;
      foundRotation = rotationAngle;
   }
}

(You could speed this up using an FFT if your histogram length is a power of two. But the code would be a lot more complex, and for 256 bins, it might not matter that much.)

Niki
  • Excellent answer; thank you for taking the time to write and test this. How can I use the histogram as a whole to definitively decide on an orientation? Your last example looks pretty spot on; the maximum is *exactly* 90 degrees apart, although I can see how if the maxima came out slightly differently it could have chosen a different peak. – mpen Apr 17 '12 at 01:46
  • @Mark: The maxima are exactly 90° apart, because it's exactly the same image, rotated by 90°, so the gradients have exactly the same magnitude and are rotated exactly 90°. If you would resize the image, or compress it with lossy compression, the mode might well be a different peak. – Niki Apr 17 '12 at 06:13
  • Yes, that was my point -- they came out *exactly* correct, not even a little off. I've run some experiments with just rotating images 90 degrees, and the answers have always come out right. I'll try with lossy jpeg compression and resizing hopefully later today. – mpen Apr 17 '12 at 15:29
  • I ran some more tests running various Photoshop filters on my images to see if it would align them the same way. I tried resizing an image down to 10%, saving at 0/12 jpeg quality, auto-balancing the colors, desaturating the colors, super-saturating the colors, rotating the hues, gaussian blurring the image, and sharpening it. The only case that failed was super-saturation. By super-saturation I mean to the point where the edges become hard and bright blue or green. Otherwise it seems to work extremely well in practice. – mpen Apr 18 '12 at 02:23
  • -- Although I managed to fix these cases too with some resizing and Gaussian blurring. – mpen Apr 18 '12 at 02:59
1

Well, I can give you another way of doing it. It will not be pretty, but I hope it works for you.

It's likely your computations are OK; it's just that the gradient, once averaged, ends up with a different average value than what you expect. So I suspect that, looking at the image, you feel there must be a different average angle in it. Therefore:

  • Convert the image to binary.
  • Find lines using hough transform
  • Take the longest line and compute its angle. This should give you the angle that is most prominent.
  • You might need some pre/post processing to get the lines right.
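The voting step of the Hough transform above can be sketched like this (Python; assumed parameters not specified in the answer: 1-degree theta bins, integer rho bins, and input given as the list of foreground pixel coordinates from the binarized image):

```python
import math

# Sketch of Hough line voting: each foreground pixel votes for every
# (theta, rho) line passing through it; the strongest line is the
# (theta, rho) cell with the most votes.

def hough_peak(points, theta_bins=180):
    """Return (theta_degrees, rho) of the strongest line."""
    votes = {}
    for x, y in points:
        for t in range(theta_bins):
            theta = math.radians(t)
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            votes[(t, rho)] = votes.get((t, rho), 0) + 1
    return max(votes, key=votes.get)

# A vertical line of pixels at x = 3: the winning cell should describe
# the line x = 3 (theta near 0, rho = 3).
line = [(3, y) for y in range(10)]
```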

And as one more approach: try GIST. This is basically an implementation most widely used in scene recognition. Your images look like real scenes, and hence I would suggest taking this approach. This method will give you a vector that you compare against different orientation vectors of the same image. This is a very well-known technique and should definitely be applicable in your case.

  • I think this is going to have the same problem that nikie alludes to: what if there is no predominant line? – mpen Apr 17 '12 at 01:44
  • Then there will be no well defined direction. That is why you will need a reference image to find the direction. –  Apr 17 '12 at 01:48
0

Consider using the gradient of your image to compute the direction you want: en.wikipedia.org/wiki/Image_gradient

dranxo
  • 2
    That's great, but you've essentially only given me a name to what I described. Still having trouble with the implementation. – mpen Apr 16 '12 at 06:40
  • One implementation difficulty will come up when you average all the angles together. The gradient takes in an image and then outputs a vector field. At some places (for example, the black regions in your images) the gradient will be 0 and the averaging will output something unexpected. Consider using the mode or median instead. By the way, what is this for? It sounds like your end goal is image registration which is very nontrival cf http://l3.lcarrasco.com/2010/04/image-registration/ – dranxo Apr 16 '12 at 07:15
  • Nope, I'm not going that crazy. I just want to orient identical images in the same 90-degree direction. For example, photos taken with a digital camera often come out sideways, and sometimes they get uploaded to the web this way too. Given 2 otherwise identical images, I want to orient them the same way so that I can determine if they're the same image. I know that I could rotate the image a few times and do multiple comparisons, but that would be too slow. I want to pre-process the images and align them in a consistent direction to speed up future comparisons. – mpen Apr 16 '12 at 07:26
  • The angle between two images. Looks like this guy has a nice prescription for it that's very close to what you're doing right now but has a lot of detail: http://www.mathworks.com/matlabcentral/newsreader/view_thread/248242 It's in matlab but that's kinda the de facto standard for prototyping ideas in image processing. – dranxo Apr 16 '12 at 07:39
0

You need to convolve your image with two Gaussian derivative kernels (one in X and one in Y). These are actually the Lx and Ly in the answer above.

Subtract the average pixel intensity before computing the summed product between the sliding window (a subimage of your original image) and the first-order Gaussian derivative functions.

See for example this tutorial: http://bmia.bmt.tue.nl/people/bromeny/MICCAI2008/Materials/05%20Gaussian%20derivatives%20MMA6.pdf

Choose the optimal smoothing factor sigma >= 1.

To compute the Gaussian kernels, differentiate the 2D Gaussian function (known from the normal distribution) once, with the 1-D variable '(x-0)^2' substituted by (x^2 + y^2). You can plot it in 2D, for example in MS Excel.
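For illustration, here's a sketch of sampling such a first-order Gaussian derivative (Python; since the 2-D derivative kernel separates into g'(x) * g(y), a 1-D sample with the suggested sigma = 1 is enough to show the shape):

```python
import math

# Sample the derivative of a 1-D Gaussian:
#   d/dx [ exp(-x^2 / (2 sigma^2)) / (sigma sqrt(2 pi)) ]
# = -x / (sigma^3 sqrt(2 pi)) * exp(-x^2 / (2 sigma^2))

def gaussian_deriv_1d(sigma=1.0, radius=None):
    if radius is None:
        radius = int(3 * sigma)  # common truncation at 3 sigma
    norm = 1.0 / (sigma ** 3 * math.sqrt(2.0 * math.pi))
    return [-x * norm * math.exp(-x * x / (2.0 * sigma ** 2))
            for x in range(-radius, radius + 1)]

kernel = gaussian_deriv_1d()
# Odd-symmetric, zero at the center, and sums to zero -- so convolving
# a constant image yields zero response, as a derivative filter should.
```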

Good luck!

Michael

user1391128