1

Consider a large set of floating-point intervals in 1-dimension, e.g.

 [1.0, 2.5],                1.0 |---------------|2.5

 [1.5, 3.6],                      1.5|---------------------|3.6

 .....

It is desired to find all intervals that contain a given point. For example given point = 1.2, algorithm should return the first interval, and if given point = 2.0, it should return the first two interval in the above example.

In the problem I am dealing, this operation needs to be repeated for a large number of times for a large number of intervals. Therefore a brute-force search is not desired and performance is an important factor.

After searching about it, I saw this problem is addressed using interval skip list in the context of computational geometry. I was wondering if there is any simple, efficient C++ implementation available.


EDIT: To be more precise about the problem, there are N intervals and for M points, it should be determined which intervals contain each point. N and M are large numbers where M is larger than N.

ramino
  • 488
  • 4
  • 12
  • why skip lists? Skip lists are used as an efficient data structure mapping a key to a value; you just have a search problem. I think you want an [interval tree](https://en.wikipedia.org/wiki/Interval_tree) instead. – Jason S Feb 20 '15 at 22:44
  • I can't see where skip lists are needed here. Testing whether a point is in an interval is basically 2 comparisons, so if you have N intervals you wil need 2N comparisons. This will be a simple O(N) algorithm, and since you average PC can do a few million comparisons per second you should have your answer in the blink of an eye. – kuroi neko Feb 20 '15 at 22:48
  • kuroi: it's faster than O(N) if you have indexed data structures – Jason S Feb 20 '15 at 22:49
  • Of course, but it would be justified only for a few dozen million matches or more, which might well not be needed. And if there are many intervals, the cost of specialized data structure construction might not be worth the savings in actual search. – kuroi neko Feb 20 '15 at 22:51
  • @SaniHuttunen: number of points is more than number of intervals. – ramino Feb 21 '15 at 00:01
  • @kuroineko: you suggest a brute-force search based on simplicity of comparison operation, but as I explained a brute-force search is slow, since it needs to be repeated for a large number of points (number of points are more than intervals). – ramino Feb 21 '15 at 00:14
  • Not really. I suggested that you should define which was more numerous, intervals or points. Now that you edited your post, the proposed answer seems like a possible solution. – kuroi neko Feb 21 '15 at 00:22
  • @ramino: 2 > 1... We still need a number on both... – Sani Huttunen Feb 21 '15 at 01:12
  • @SaniHuttunen, it is hard to give a figure. The number of points in my problem equals to the number of elements in a 2D mesh used in FEM. It could vary depending on the mesh resolution, but I'd say typically not more 500k. Number of segments typically would not exceed 100k. – ramino Feb 21 '15 at 01:19
  • @ramino Then your question is asked in the wrong forum... – Sani Huttunen Feb 21 '15 at 01:49
  • @SaniHuttunen no, your persistence to answer here is misplaced. – ramino Feb 21 '15 at 02:19

2 Answers2

1

Suggest using CGAL range trees:

Wikipedia says interval trees (1-dimensional range trees) can "efficiently find all intervals that overlap with any given interval or point".

Jason S
  • 184,598
  • 164
  • 608
  • 970
  • The construction time is in n log(n), so if you have a high enough number of intervals relative to the number of points to test, this can become worse than brute force.The original question does not define the requirements precisely enough to pick an appropriate algorithm, IMHO. – kuroi neko Feb 20 '15 at 23:02
  • @kuroineko The OP explicitly said that he wants something other than brute force search and that the searches will be done many times. – jschultz410 Feb 20 '15 at 23:36
  • ...on a large number of intervals. As the first comment states, it boils down to definining which large is bigger than the other many. – kuroi neko Feb 20 '15 at 23:37
  • @Jason S, as you mentioned, interval trees was the first option I came across. I tired interval_map and continuous_interval in boost. But it does not give me the list of intervals containing a given point. Then I came across with this in CGAL: http://doc.cgal.org/latest/Interval_skip_list/group__PkgIntervalSkipList.html. Yet I was hoping for a simple and neat solution similar to https://github.com/ekg/intervaltree – ramino Feb 21 '15 at 00:09
  • @Jason S: following your suggestion, after switching back and forth between interval skip list and interval trees, I found this implementation (http://stackoverflow.com/a/21327959/1911163) which finds all floating-point intervals containing a given point in a short self-contained code. – ramino Feb 23 '15 at 22:21
1

If your distribution of intervals allows it, it may be worth to consider a gridding approach: choose some grid size s and create an array of lists. Every k-th list enumerates the intervals that overlap with the "cell" [k.s, (k+1).s[.

Then a query amounts to finding the cell that contains the query point (in O(1)) and reporting all intervals in the list that effectively contain it (in O(K)).

Both preprocessing time and storage are O(I.L+G) where I is the number of intervals and L the average interval length in terms of the grid size and G the total number of grid cells. s must be chosen carefully.