Suppose five hue samples are taken using a simple HSV color model, with values 355, 5, 5, 5, 5 (in degrees). All are red hues and lie "next" to each other as far as perception is concerned. But the simple average is 75, which is far from 0 or 360 and close to a yellow-green.

What is a better way to calculate this mean and the associated standard deviation?

drb
  • After considerably more digging around with Google, found a link back to stackoverflow from a post actually discussing average wind direction: http://stackoverflow.com/questions/491738/how-do-you-calculate-the-average-of-a-set-of-angles/3651941#3651941 But it doesn't address the issue of standard deviation. – drb Nov 17 '11 at 15:52
  • Once you've got a mean you're happy with, you can just calculate the standard deviation from the mean deviations, right? – AakashM Nov 17 '11 at 16:32
  • @AakashM, I'm still trying to figure this out. I know code. Statistics I'm a bit more hazy on. – drb Nov 17 '11 at 16:41
  • On several websites where I see this question trying to be answered, I also see contrived data sets, such as two entries for 270 and 90, which then say that the average is meaningless. In order to avoid this, here is a small sample of actual values with which I'm working: (naive mean and std are: 185.658 174.848) 347.059 0 359.059 347 354.05 353.012 13.012 358.118 8.06723 354.118 0.967742 0.97561 351.074 8.06324 346.098 0.941176 1.88235 355.082 6.93227 359.059 1.88235 358.088 0.97166 0.983607 354.958 – drb Nov 17 '11 at 16:42
  • Oh ok, I get you. To calculate the s.d. given the mean, follow eg [these instructions](http://www.mathsrevision.net/gcse/pages.php?page=42) (warning: Comic Sans :p) – AakashM Nov 17 '11 at 16:55
  • Here's a link to a page on google books concerning standard deviation in circular data and the von Mises distribution: http://books.google.com/books?id=wGPj3EoFdJwC&pg=PA54&lpg=PA54&dq=%22standard+deviation%22+circular+quantities&source=bl&ots=PiYoAyzGO6&sig=kgjr6mEz1znibEfW2-Xp94iAStY&hl=en&ei=XH3FTrPIJcObtwfP5pCmCg&sa=X&oi=book_result&ct=result&resnum=5&ved=0CDAQ6AEwBA#v=onepage&q=%22standard%20deviation%22%20circular%20quantities&f=false – drb Nov 18 '11 at 15:25

2 Answers

The simple solution is to convert those angles to a set of vectors, from polar coordinates into cartesian coordinates.

Since you are working with colors, think of this as a conversion into the (a*,b*) plane. Then take the mean of those coordinates and convert back into polar form. In MATLAB:

theta = [355,5,5,5,5];
x = cosd(theta); % cosine in terms of degrees
y = sind(theta); % sine with a degree argument

Now, take the mean of x and y, compute the angle, then convert back from radians to degrees.

meanangle = atan2(mean(y),mean(x))*180/pi
meanangle =
       3.0049

Of course, this solution is valid only for the mean angle. As you can see, it is consistent with (though not identical to) the mean of the angles taken directly, once I recognize that 355 degrees really wraps to -5 degrees.

mean([-5 5 5 5 5])
ans =
     3

The simplest way to compute the standard deviation is to apply std to the wrapped angles:

std([-5 5 5 5 5])
ans =
       4.4721

Yes, that requires me to do the wrap explicitly.
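
The explicit wrap can be automated. The following is a minimal Python sketch (not the MATLAB used above, and not part of the original answer): it computes the vector mean as described, then wraps each angle's deviation from that mean into [-180, 180) before taking an ordinary sample standard deviation. The function names `circular_mean_deg` and `wrapped_std_deg` are made up for illustration, and the approach assumes the data are clustered tightly enough that the vector mean is meaningful (it is not for data such as 90 and 270).

```python
import math
import statistics


def circular_mean_deg(angles):
    """Direction of the mean vector, in degrees within [0, 360)."""
    n = len(angles)
    x = sum(math.cos(math.radians(a)) for a in angles) / n
    y = sum(math.sin(math.radians(a)) for a in angles) / n
    return math.degrees(math.atan2(y, x)) % 360.0


def wrapped_std_deg(angles):
    """Sample standard deviation of the angles after wrapping them around the mean."""
    mean = circular_mean_deg(angles)
    # Signed shortest-arc deviation of each angle from the mean, in [-180, 180).
    deviations = [((a - mean + 180.0) % 360.0) - 180.0 for a in angles]
    # statistics.stdev uses the n - 1 denominator, like MATLAB's std.
    return statistics.stdev(deviations)
```

For `[355, 5, 5, 5, 5]` this gives a mean of about 3.0049 and a standard deviation of about 4.4721, matching the MATLAB results above without wrapping 355 to -5 by hand.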

  • Thank you. Yes the standard deviation calculations must always take into account the wrap, whether at 0, red, in degrees or at Pi and negative Pi, a blue green, in radians. Approaches for characterizing circular data other than std exist as well. See links above. – drb Nov 18 '11 at 15:23

I think the method proposed by user85109 is a good way to compute the mean, but not the standard deviation. Imagine you have three angles: 180, 180, 181.

The mean would be computed correctly, as a number approximately equal to 180,

but from [180, 180, -179] you would compute a high variance when in fact it is near zero.

At first glance, I would compute the means and variances separately for the positive half of the angles, [0, 180], and for the negative half, [0, -180], and then compute the combined variance: https://www.emathzone.com/tutorials/basic-statistics/combined-variance.html

This has to take into account that the global mean, and the difference between it and the local means, must be computed in both directions, clockwise and counterclockwise, and the correct one has to be chosen.
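
A different way to avoid choosing a split point, swapped in here rather than implementing the split-and-combine procedure above, is the circular standard deviation from directional statistics, sqrt(-2 ln R), where R is the length of the mean vector. The Python sketch below is an illustration under that assumption; the function name `circular_std_deg` is hypothetical. It depends only on the sines and cosines of the angles, so 181 and -179 are treated identically.

```python
import math


def circular_std_deg(angles):
    """Circular standard deviation sqrt(-2 ln R), returned in degrees."""
    n = len(angles)
    x = sum(math.cos(math.radians(a)) for a in angles) / n
    y = sum(math.sin(math.radians(a)) for a in angles) / n
    # Mean resultant length; clamp to 1 to guard against rounding error.
    r = min(math.hypot(x, y), 1.0)
    if r == 0.0:
        # Angles cancel out completely: the spread is unbounded.
        return math.inf
    return math.degrees(math.sqrt(-2.0 * math.log(r)))
```

`circular_std_deg([180, 180, 181])` and `circular_std_deg([180, 180, -179])` return the same small value, under one degree. For tightly clustered data the result is close to the population (n denominator) standard deviation of the wrapped angles, so it will be slightly smaller than a sample standard deviation.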

FFolch