
I'm looking for a fast solution for the following problem:

[Illustration of the minimum-distance problem]

I have a fixed point (let's say the upper right on the white measurement line) and need to find the closest point on a curve made of equally spaced points (the lower curve). Additionally, I do this for every point on the upper curve to draw the distances between the curves with different colours (three levels: below minimum [red], between minimum and maximum [orange] and above maximum [green]).

My current solution is a tradeoff: I take the fixed point, iterate over an arbitrary interval (e.g. 50 units to the left and right of the fixed point) and calculate the distance for each pair of points. This saves some CPU time, but it is neither elegant nor accurate, since I could miss a minimum distance outside the chosen interval.
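A minimal sketch of this interval-based approach (the `Point` type, function names, and window size are illustrative assumptions, not the actual code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };

// Scan only the curve points within `halfWindow` samples on either side of
// the fixed point's index; fast, but may miss a minimum outside the window.
double minDistInWindow(const Point& fixedPt, const std::vector<Point>& curve,
                       std::size_t fixedIndex, std::size_t halfWindow)
{
    std::size_t lo = fixedIndex > halfWindow ? fixedIndex - halfWindow : 0;
    std::size_t hi = std::min(fixedIndex + halfWindow + 1, curve.size());
    double best = std::numeric_limits<double>::max();
    for (std::size_t i = lo; i < hi; ++i)
        best = std::min(best, std::hypot(curve[i].x - fixedPt.x,
                                         curve[i].y - fixedPt.y));
    return best;
}
```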

Any proposals for a faster algorithm?

Edit: Equally spaced means all points have the same distance on the x-axis, this is true for both curves. Also I do not need to interpolate between the points, this would be too time consuming.

nullp01nter
  • It depends how the shape of the curve is defined. – Steve Jessop Oct 08 '12 at 10:03
  • The curve is the surface of an engraved material and it is sampled in fixed intervals. I don't know how to describe it better regarding your comment. – nullp01nter Oct 08 '12 at 10:08
  • Are points on *both* curves "equally spaced"? Does "equally spaced" mean at constant x-steps, constant diagonal steps, or what? – James Waldby - jwpat7 Oct 08 '12 at 10:14
  • @nullp01nter: So it's basically `point_n = (x_n, y_n)`, where `x_n = h*n` and `y_n` depending on the curve? – Zeta Oct 08 '12 at 10:15
  • @nullp01nter: fair enough, I wasn't sure what "equally spaced points" really meant. It's possible that the closest point of approach isn't one of the measured points on the curve, it's somewhere in between them. How you would find that depends how you interpolate between the measured points of the curve: linear; cubic spline; whatever. So you're already making a tradeoff of accuracy for speed by only comparing your point with the measured points of the curve. – Steve Jessop Oct 08 '12 at 10:23
    @jwpat7: Points are equally spaced (constant x steps) on both curves. – nullp01nter Oct 08 '12 at 10:49
  • @SteveJessop: Right, but I do need to measure between discrete values, interpolation would be overkill. – nullp01nter Oct 08 '12 at 10:55
  • @Zeta: Yes, with h being a whole number. – nullp01nter Oct 08 '12 at 10:57
  • What prevents multiple points on the curve from being equally close? Worst case might be a perfect semi-circle, where every point on the curve is equally close to your one point. – Brendan Oct 08 '12 at 11:21
  • Theoretically, this can be the case. But even if it is, the minimum search terminates, it will simply output the first minimum point in my approach above. In practice, all point coordinates are quantized and a perfect semi-circle is not possible. – nullp01nter Oct 08 '12 at 14:20

3 Answers


Rather than an arbitrary distance, you could perhaps iterate until "out of range".

In your example, suppose you start with the point on the upper curve at the top-right of your line. Dropping vertically downwards, you get a distance of (by my eye) about 200um.

Now you can move right from here testing points until the horizontal distance is 200um. Beyond that, it's impossible to get a distance less than 200um.

Moving left, the distance goes down until you find the 150um minimum, then starts rising again. Once you're 150um to the left of your upper point, again, it's impossible to beat the minimum you've found.

If you'd gone left first, you wouldn't have had to go so far right, so as an optimization either follow the direction in which the distance falls, or else work out from the middle in both directions at once.

I don't know how many um 50 units is, so this might be slower or faster than what you have. It does avoid the risk of missing a lower value, though.
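A sketch of this early-termination scan, assuming the curve points are stored sorted by x (the `Point` type and all names are made up for illustration):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Expand right, then left, from the start index; abandon a direction once
// the horizontal distance alone can no longer beat the best found so far.
double minDistEarlyStop(const Point& p, const std::vector<Point>& curve,
                        std::size_t startIndex)
{
    auto dist = [&](std::size_t i) {
        return std::hypot(curve[i].x - p.x, curve[i].y - p.y);
    };
    double best = dist(startIndex);
    for (std::size_t i = startIndex + 1; i < curve.size(); ++i) {
        if (curve[i].x - p.x >= best) break;   // out of range to the right
        best = std::min(best, dist(i));
    }
    for (std::size_t i = startIndex; i-- > 0; ) {
        if (p.x - curve[i].x >= best) break;   // out of range to the left
        best = std::min(best, dist(i));
    }
    return best;
}
```

Following the direction in which the distance falls first, as suggested, would just swap the order of the two loops.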

Since you're doing lots of tests against the same set of points on the lower curve, you can probably improve on this by ignoring the fact that the points form a curve at all. Stick them all in a k-d tree or similar, and search that repeatedly. It's called a Nearest neighbor search.
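A minimal 2-d tree sketch along these lines (all names illustrative; in practice a library such as ANN or nanoflann would be a better choice than hand-rolling this):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };

// Implicit k-d tree over a point array: the median of each subrange is the
// node, alternating split axes. Built once, then queried repeatedly.
class KdTree {
public:
    explicit KdTree(std::vector<Point> pts) : pts_(std::move(pts)) {
        build(0, pts_.size(), 0);
    }
    // Returns the squared distance from q to its nearest stored point.
    double nearestSq(const Point& q) const {
        double best = std::numeric_limits<double>::max();
        search(q, 0, pts_.size(), 0, best);
        return best;
    }
private:
    std::vector<Point> pts_;
    void build(std::size_t lo, std::size_t hi, int axis) {
        if (hi - lo <= 1) return;
        std::size_t mid = lo + (hi - lo) / 2;
        auto cmp = [axis](const Point& a, const Point& b) {
            return axis == 0 ? a.x < b.x : a.y < b.y;
        };
        std::nth_element(pts_.begin() + lo, pts_.begin() + mid,
                         pts_.begin() + hi, cmp);
        build(lo, mid, 1 - axis);
        build(mid + 1, hi, 1 - axis);
    }
    void search(const Point& q, std::size_t lo, std::size_t hi,
                int axis, double& best) const {
        if (lo >= hi) return;
        std::size_t mid = lo + (hi - lo) / 2;
        double dx = q.x - pts_[mid].x, dy = q.y - pts_[mid].y;
        best = std::min(best, dx * dx + dy * dy);
        double delta = axis == 0 ? dx : dy;
        // Descend into the side containing q first; visit the other side
        // only if the splitting plane is closer than the best found so far.
        std::size_t nearLo = delta < 0 ? lo : mid + 1;
        std::size_t nearHi = delta < 0 ? mid : hi;
        std::size_t farLo  = delta < 0 ? mid + 1 : lo;
        std::size_t farHi  = delta < 0 ? hi : mid;
        search(q, nearLo, nearHi, 1 - axis, best);
        if (delta * delta < best)
            search(q, farLo, farHi, 1 - axis, best);
    }
};
```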

Steve Jessop
  • +1: As long as we cannot make other assumptions about the curve (e.g. that the y value changes only within bounds), this is the fastest method. – Zeta Oct 08 '12 at 10:32
  • In particular we have no grounds to know whether the value of y exactly 149um to the right of our start point, is equal to the y-value of our top-right starting point. If so, then that is the closest point on the lower curve to our upper point, but the straight-line route to it actually passes through the upper curve. Which might "not count", I don't know. It might also be physically implausible -- maybe the material can't hold that tight a curve -- in which case we could stop searching sooner. – Steve Jessop Oct 08 '12 at 10:40
  • Thank you for your suggestions. I'll have to experiment with the k-d tree, this will take a while. – nullp01nter Oct 08 '12 at 11:18

It may help to identify this problem as a nearest neighbour search problem. That link includes a good discussion about the various algorithms that are used for this. If you are OK with using C++ rather than straight C, ANN looks like a good library for this.

It also looks as though this question has been asked before.

chthonicdaemon
  • Thank you for the recommendation. Yes, I looked at the other questions too, but had the hope that my special case with those contiguous curves leaves room for some unusual optimization or shortcut. – nullp01nter Oct 08 '12 at 11:43

We can label the top curve y=t(x) and the bottom curve y=b(x), and label the closest-function x_b=c(x_t). We know that the closest-function is monotone non-decreasing, since two shortest paths never cross each other.

If you know that the distance function d(x_t,x_b) has only one local minimum for every fixed x_t (this happens if the curve is "smooth enough"), then you can save time by "walking" the curve:

- start with x_t=0, x_b=0
- while x_t <= x_max
-- find the closest x_b by local search
     (increment x_b while the distance is decreasing)
-- add {x_t, x_b} to the result set
-- increment x_t

If you expect x_b to be smooth enough, but you cannot assume that and you want an exact result:

Walk the curve in both directions. Where the results agree, they are correct. Where they disagree, run a complete search between the two results (the leftmost and the rightmost local minima). Sample the "ambiguous block" in an order (binary subdivision) that allows the most pruning due to the monotonicity.

As a middle ground:

Walk the curve in both directions. If the results disagree, choose the better of the two. If you can guarantee at most two local minima for each fixed x_t, this produces the optimal solution. There are still some pathological cases where the optimal solution is not found: a local minimum flanked by two other local minima that are both worse than it. I dare say it is uncommon to find a case where the solution is far from optimal (assuming a smooth y=b(x)).

John Dvorak