1

I was recently asked in an interview about what is the best method for computing the distance between two points using floating-point arithmetic.

My naive answer was :

Assuming the two points are (a,b) and (c,d), the answer is

dist = sqrt( (a-c)*(a-c) + (b-d)*(b-d))

However, I now realize that if the points are very close, squaring first might cause an underflow. Is there a more accurate way to do this?

johngreen
  • 2,676
  • 5
  • 32
  • 47
  • 2
    Many languages have a math library that provides a [`hypot`](https://en.wikipedia.org/wiki/Hypot) method, so in those languages you'd simply do `hypot(a-c, b-d)`. Assuming that `hypot` is well implemented, that should avoid problems with intermediate overflow and underflow. Note that `hypot` is also a recommended IEEE 754 standard operation. Are the interviewers allowing you to use `hypot`, or are you required to roll your own? – Mark Dickinson Jul 19 '18 at 10:41
  • 2
    Relevant blog post: https://www.johndcook.com/blog/2010/06/02/whats-so-hard-about-finding-a-hypotenuse/, though I think there are more accurate methods. (On a typical machine, scaling by a power of two would be better than scaling by the larger of the two values.) – Mark Dickinson Jul 19 '18 at 10:43
  • 1
    @MarkDickinson It was a very open-ended question. No constraints were specified. However, the interview was for a scientific computing role, so I'm guessing they wanted to know if I had a good understanding of pitfalls of floating point arithmetic. – johngreen Jul 19 '18 at 11:49
  • This implementation of `hypot` in Netlib also seems to be a well-though solution : http://www.netlib.org/fdlibm/e_hypot.c – johngreen Jul 19 '18 at 12:07
  • If you are comparing many triangles and are only looking to rank the hypotenuse lengths, then the (expensive) sqrt operation is not needed. – rajah9 Jul 19 '18 at 12:17
  • @EricPostpischil : I have improved the title. My question is looking for increased accuracy in computation of hypotenuse. – johngreen Jul 20 '18 at 05:59
  • Some thoughts: If `a` and `c` are close enough, Sterbenz’s Lemma makes `a-c` exact. If not, then `a-c` may have some error, but it will not be large. Then, aside from the range issue as noted in comments, `(a-c)*(a-c) + (b-d)*(b-d)` is well behaved numerically. And since `sqrt` has low sensitivity to small errors (consider its derivative), the straightforward calculation of `(a-c)*(a-c) + (b-d)*(b-d)` generally gives a good result. Going to abnormal effort for more accuracy will only make a slight improvement. This makes me think the range issue is all the job interview question was seeking. – Eric Postpischil Jul 21 '18 at 00:11
  • Another method is shown in one of my old answers: https://stackoverflow.com/a/3506633/179910. It not only avoids over/underflow, but is quite fast (depending on the precision you need). You might find some of the other answers there interesting/useful as well. – Jerry Coffin Jul 21 '18 at 03:52

1 Answers1

3

As Mark Dickinson mentioned in the comment, math libraries tend to provide a hypot function.

The wikipedia article actually addresses the overflow/underflow issue.

https://en.wikipedia.org/wiki/Hypot

From the article:

The difficulty with the naive implementation is that x2 or y2 may overflow or underflow, unless the intermediate result is computed with extended precision. A common implementation technique is to exchange the values, if necessary, so that |x| ≥ |y|, and then use the equivalent form:

enter image description here

The computation of y/x cannot overflow. If y/x underflows, the final result is equal to |x|, which is correct within the precision of the calculation. The square root is computed of a value between 1 and 2. Finally, the multiplication by |x| cannot underflow, and overflows only when the result is too large to represent.

Community
  • 1
  • 1
Pedrom
  • 3,823
  • 23
  • 26