
I have a list of latitude and longitude values, and I'm trying to find the distance between them. Using a standard great circle method, I need to find:

acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1))

And multiply this by the radius of earth, in the units I am using. This is valid as long as the values we take the acos of are in the range [-1,1]. If they are even slightly outside of this range, it will return NaN, even if the difference is due to rounding.
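As a minimal sketch, the formula above can be written as an R function (assuming lat/long are already in radians; the 6371 km mean Earth radius is an assumed value):

```r
# Spherical law of cosines, as in the formula above.
# lat/long in radians; radius = 6371 km is an assumed mean Earth radius.
gc_dist <- function(lat1, long1, lat2, long2, radius = 6371) {
  cosc <- sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(long2 - long1)
  # Rounding can push cosc slightly outside [-1, 1], in which case acos() is NaN.
  radius * acos(cosc)
}

gc_dist(0, 0, 0, pi / 2)  # a quarter of the equator, roughly 10007.5 km
```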

The issue I have is that sometimes, when two lat/long values are identical, this gives me a NaN error. It doesn't happen for every identical pair, but it always happens at the same positions in a given list. For instance, I have a person stopped on a road in the desert:

Time  |lat     |long
1:00PM|35.08646|-117.5023
1:01PM|35.08646|-117.5023
1:02PM|35.08646|-117.5023
1:03PM|35.08646|-117.5023
1:04PM|35.08646|-117.5023

When I calculate the distance between the consecutive points, the third value, for instance, will always be NaN, even though the others are not. This seems to be a weird bug with R rounding.

Praveen Kumar Purushothaman
David Manheim
  • The function works with vectors; I would use, for instance, dist(lat(1:5), long(1:5), lat(2:6), long(2:6)) as inputs, where the function signature is dist(lat1, long1, lat2, long2). – David Manheim Dec 24 '12 at 23:24
  • See [this question](http://stackoverflow.com/questions/5725101/java-trigonometry-and-double-inaccuracy-causing-nan), [this question](http://stackoverflow.com/questions/120283/working-with-latitude-longitude-values-in-java), and [Wikipedia](http://en.wikipedia.org/wiki/Great-circle_distance). – Eric Postpischil Dec 24 '12 at 23:41
  • 1
    @EricPostpischil Different language, and no solution that works in R is provided. – David Manheim Jan 16 '13 at 05:49

3 Answers


Can't tell exactly without seeing your data (try `dput`), but this is most likely a consequence of FAQ 7.31.

(x1 <- 1)
## [1] 1
(x2 <- 1+1e-16)
## [1] 1
(x3 <- 1+1e-8)
## [1] 1
acos(x1)
## [1] 0
acos(x2)
## [1] 0
acos(x3)
## [1] NaN

That is, even if your values are so similar that their printed representations are the same, they may still differ: some will be within .Machine$double.eps and others won't ...
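A quick sketch of that point: a perturbation below `.Machine$double.eps` is absorbed entirely, while a larger one survives yet still prints identically at the default 7 significant digits.

```r
x <- 1 + 1e-16  # smaller than .Machine$double.eps: absorbed, x is exactly 1
y <- 1 + 1e-8   # survives, but still prints as "1" at the default 7 digits

print(x)         # [1] 1
print(y)         # [1] 1
identical(x, 1)  # TRUE
identical(y, 1)  # FALSE -- y is genuinely above 1, so acos(y) is NaN
```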

One way to make sure the input values are bounded by [-1,1] is to use pmax and pmin: acos(pmin(pmax(x,-1.0),1.0))

Ben Bolker
  • But in the source data, they are EXACTLY the same. The precision loss is introduced entirely within R - I checked. (Even if not, it's mathematically impossible for the expression to be greater than 1.) – David Manheim Dec 24 '12 at 23:18
  • Also, if there a workaround? I need to get it to work, and my solution unvectorizes it... – David Manheim Dec 24 '12 at 23:20
  • 1
    We would need to see the exact workflow in order to figure out what happened. If the numbers are identical in (e.g.) a CSV file you read in, then they will indeed be identical thereafter, but any arithmetic operation could induce the differences -- did you read the FAQ carefully yet? – Ben Bolker Dec 24 '12 at 23:27
  • I read them in from a dbf, but my assumption was that even if the precision loss was possible, it would be the same output for identical values. – David Manheim Dec 24 '12 at 23:28
  • 1
    You should indeed get identical output for identical input values. If you can show us a *reproducible* example of the divergence (see http://tinyurl.com/reproducible-000 ), then we can explain what's happening ... – Ben Bolker Dec 24 '12 at 23:30
  • If the data I was analyzing wasn't personally identifiable, I would submit it. Maybe I can find another example after the holiday. Thanks! – David Manheim Dec 24 '12 at 23:32
  • See the link in my comment , and http://stackoverflow.com/questions/10454973/how-to-create-example-data-set-from-private-data-replacing-variable-names-and-l ... maybe those exact techniques aren't useful for you, but there are a variety of tactics for anonymizing data ... – Ben Bolker Dec 24 '12 at 23:35

A simple workaround is to use pmin(), like this:

acos(pmin(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1),1))

This caps the value at exactly 1, so precision loss can no longer push it above the valid range of acos.

This doesn't explain what is happening, however.

(Edit: Matthew Lundberg pointed out that I need to use pmin, rather than min, to get it to work with vectorized inputs. This fixes the problem of getting it to work, but I'm still not sure why it is rounding incorrectly.)
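The pmin-versus-min distinction matters because min() collapses all its arguments into a single scalar, while pmin() clamps elementwise. A quick sketch:

```r
s <- c(0.5, 1 + 1e-8)  # second element pushed just above 1 by rounding

min(s, 1)   # 0.5       -- min() collapses everything to one scalar
pmin(s, 1)  # 0.5 1.0   -- pmin() clamps each element separately

acos(pmin(s, 1))  # both values are now in [-1, 1], so no NaN
```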

David Manheim

I just encountered this. It is caused by an input slightly larger than 1: due to floating-point error, my inner product between unit-norm vectors came out a bit above 1 (e.g. 1.00001), and acos() is only defined on [-1, 1]. Clamping the input to that range solves the problem.

For numpy: np.clip(your_input, -1, 1)

For PyTorch: torch.clamp(your_input, -1, 1)

Wey Shi
  • a little off-topic since the original post is tagged [r], but it will presumably be useful so I'm not going to down-vote ... – Ben Bolker Apr 07 '22 at 16:08