3

Given a set of ordered points, and a path made up of ordered lat,lon points that goes near those points (in lat/lon coordinates), I want to associate the points with the path, ideally with good algorithmic complexity (n*log(n)) or better, but maybe that might not be realistic.

The below diagram illustrates my question better. The blue line is the ordered path that is provided, and the red points are in the same order as the blue line. The green path is my desired result, which merges the red points and the blue line into a new ordered path.

Diagram explaining question

Some threshold would have to be set for the distance of the red points from the blue path, let's assume that the red points are at most 50 meters from the blue path.

So, this is definitely the most mathmatical and unusual question I have asked on Stack Overflow. Any ideas would be great on solving this. I am planning to use it to merge GTFS shape data with trip data which describes stop times, and build it into the open source project, Depart App.

Thanks for your help!

Marvin Pinto
  • 30,138
  • 7
  • 37
  • 54
Russell
  • 368
  • 2
  • 4
  • 17
  • Do I understand this correctly; you want the green path, which is the path connecting the ordered set containing the red points and the nodes of the blue path? – carlpett Jul 07 '11 at 05:23
  • That is correct, and the green line has to have all the red points and blue points in order. – Russell Jul 08 '11 at 04:58

4 Answers4

2

It seems to me, you have two sets of points, the lat/lon points, and the red points. One of the lat/lon points is your starting point. Now considering all the other points as a set, use an (approximate?) nearest neighbor algorithm to find the next point. Now repeat. The only trouble is, that nearest neighbor algorithms tend to be O(n), which makes what you want to do O(n^2).

andrewdski
  • 5,255
  • 2
  • 20
  • 20
  • This seems like a good, simple idea. One important thing to note is that the resulting ordered set of points has to have all the blue points in order and all the red points in order (inter-mingled). So if the bus route went down a street, then came back up it on the same trip (unusual, but very possible), the nearest neighbour algorithm would fail, as the resultant points would be out of order. However, perhaps by just comparing the next two points from each ordered set, then a good decision could be made. – Russell Jul 07 '11 at 05:36
  • 2
    A modification of this idea works great. Take the shape and the stop points, assume one is the first point in the line. Then look at the distance between the next shape point and next stop point, and pick the closer one. Repeat n times for a log(n) operation! I bet there are going to be some issues with this simple approach, but so far, it seems to be fine. – Russell Jul 07 '11 at 06:45
  • That's not a minor modification, that's a major improvement. You could post this as your own answer - I'd vote for it. (since that's what I was about to suggest.) – AShelly Jul 07 '11 at 15:23
  • But isn't that O(N), not lg(N)? – AShelly Jul 07 '11 at 15:26
  • But looking at your picture, what if I rotate the first blue leg, and first two red points counter clockwise around the around the lat/lon point between red point #2 and #3. It seems to me I could move red points #2 and #3 closer together so that you go direct from #2 to #3. Now depending on the length of blue leg #2, you may never be closer to the next blue point. Is that ok? (And actually, my solution might skip a blue point too, but it can recover and start using them again. – andrewdski Jul 08 '11 at 04:43
  • Good point, and I was worried about that, though fortunately with Translink's data set, this doesn't seem to be a problem. The thing is I really don't want to miss any stops (missing a shape point is ok though). So I would rather have it fail than skip points. I might have to add some intelligence in this respect though, perhaps by looking at several upcoming stops (red points) instead just one. – Russell Jul 08 '11 at 04:57
2

Building on the other suggestions provided here, I think I have found a O(n) algorithm that works effectively.

The idea is to first pick the first red point as the starting point (or could choose the first blue point). Then compare the distance from this point to the next red point and the next blue point, pick the closer. Repeat until both lists are depleted. This seems to be quite effective on the Translink data set. I will give an update if I make tweaks to this idea.

Russell
  • 368
  • 2
  • 4
  • 17
1

For each red point iterate through blue segments and find the best segment for it. What exactly "best" is depends on the task, but it looks like a good measure would be "how much longer path becomes". If A is a red point and BC is a segment, then the best segment will minimize this: f(A,BC)=(|BA|+|AC|-|BC|).

When the best segment is found, it should be replaced with BA and AC. In the same way other points should be processed.

Without optimizations this will give O(N^2).

If points are distributed more or less uniformly and segment length is significantly less than size of entire figure, some space partitioning may help.

maxim1000
  • 6,297
  • 1
  • 23
  • 19
  • Good answer, though I don't completely understand what you mean by embedding the point, your idea seems pretty much the same as the first answerer, though the space partitioning is a good idea. – Russell Jul 07 '11 at 05:38
  • By embedding point A I meant replacing the closest segment BC with two segments: BA and AC. Probably I misunderstood the first answer, because I thought he suggests to search for nearest points instead of segments... – maxim1000 Jul 07 '11 at 07:34
  • No, actually, I meant points. (And amusingly, when I read yours I misunderstood and thought you meant points!) – andrewdski Jul 08 '11 at 04:37
  • After some thinking I found that probably neither distance point-segment nor distance point-point is not a good measure. So I've a bit changed the algorithm. – maxim1000 Jul 08 '11 at 14:19
1

Perhaps you could use a custom sweep line algorithm, this should reduce the complexity of finding the nearest line segment.

Igor Brejc
  • 18,714
  • 13
  • 76
  • 95
  • Yes, this seems like a good idea as well. I think I may have to do something like this (limiting the search area), but for now, just looking at the next in the list and picking the closer works. Btw Maperitive is great! – Russell Jul 08 '11 at 03:19