Given a set of intervals, find the interval which has the maximum number of intersections (not the length of a particular intersection). So if input (1,6) (2,3) (4,11), (1,6) should be returned. Some suggest to use Interval Tree to get this done in O(nlogn), but I did not understand how to construct and use the Interval Tree after reading its wiki page. I believe it can be done by doing some sort of sorting and scanning algorithm. If Interval tree is the only option, please educate me how to construct/use one. Thanks.
-
You cannot hope to do this under `O(n)` without preprocessing, because you need to evaluate all intervals. – Vincent van der Weele Feb 23 '14 at 10:15
-
Sorry. Typo. Meant to type O(nlogn) – user3216886 Feb 23 '14 at 10:34
3 Answers
Don't use an interval tree. Create an event for each interval endpoint, so that each interval has a start event and a stop event. Process these events in order. (Order starts before stops if a measure-zero intersection counts as an intersection, or stops before starts otherwise.)
Initialize a map C from intervals to numbers. Initialize the start count s = 0 and the stop count t = 0. To process a start event for interval i, set s = s + 1 and then C(i) = -(t + 1). To process a stop event for interval i, set t = t + 1 and then C(i) = C(i) + s.
At the end, C maps intervals to their intersection counts. This algorithm is O(n log n) because of the sorting; that running time is optimal if the endpoints can be added and compared only, by fairly standard computational geometry techniques.

- 64,237
- 7
- 60
- 120
-
Well, that was fast. Well done. (Can't award the bounty until tomorrow.) – Sneftel Feb 25 '14 at 15:06
-
1For start event: `C(i)=-1-num_stop_events_so_far`, for stop event: `C(i)+=num_start_events_so_far`. Isn't it simpler? – Evgeny Kluev Feb 25 '14 at 20:10
-
-
Evgeny: Be precise! That's not correct. numIntersections[interval] = StartedEventsTill[end[interval]] - EndedEventsTill[start[interval]]-1 – cegprakash Feb 26 '14 at 09:43
-
@cegprakash Isn't that equivalent to what he wrote? The spacing isn't ideal. – David Eisenstat Feb 26 '14 at 12:53
-
His so_far part does not explain whether it's the start or the end point of the interval. – cegprakash Feb 26 '14 at 12:54
Note: David Eisenstat's algorithm has better performance than this one.
A simple plane-sweep algorithm will do this in O(nlogn + m*n)
, where m
is the maximum number of intersections with any single interval.
Sort the interval endpoints. Keep track of the active segments. When reaching a start point, increment the intersection counts of all active intervals, and set the new interval's intersection count to the number of active intervals (excluding itself). When reaching an end point, remove the interval from the active intervals.

- 40,271
- 12
- 71
- 104
-
I see. This is almost the same as what I have so far. But instead of sorting end points, I sort it as {(a1,start),(b1,end),(a2,start),(b2,end),...,(an,start),(bn,end)} (the first coordinate is a number and the second coordinate is a label). Sort the array in lexicographic order where start. I thought this is O(N^2) since when walking through the sorted list, we will need to "increment the intersection counts of all active intervals" which would require O(n) and result in total O(n^2). – user3216886 Feb 23 '14 at 10:42
-
-
But you make a good point about the running time. The actual running time is `O(nlogn + m*n)`, where `m` is the number of intersections with the output interval. The algorithm is also `O(n^2)`, but that's a looser bound, if there is no interval which intersects all the other intervals. – Sneftel Feb 23 '14 at 10:53
-
Incidentally, AFAICT an interval tree -- at least, using the conventional interval tree construction algorithm -- won't improve on this bound – Sneftel Feb 23 '14 at 10:54
-
So this is the best we can do? O(nlogn + m*n)? Because a simple brute force can do this in O(n^2) too... – user3216886 Feb 23 '14 at 10:59
-
I'm not sure whether it's the best we can do -- I can't even imagine how to proceed with a big-omega analysis. For sparsely distributed intervals, though, the `m` term may well be dominated by the `log n` term anyway. – Sneftel Feb 23 '14 at 11:02
-
Wait, duh. I'm being stupid. Hold on and I'll post a better algorithm. – Sneftel Feb 23 '14 at 11:03
-
Actually, there was a flaw in my idea for a better algorithm. Yeah, I think that might be the best we can do. – Sneftel Feb 23 '14 at 11:08
-
Cool. Good to know. Thanks a lot. Greatly appreciate it. Please let me know if a flawless solution comes along. :) – user3216886 Feb 23 '14 at 11:12
-
I'm curious myself. I'll put a bounty on this question when it's eligible. – Sneftel Feb 23 '14 at 11:15
A python
implementation of the David's approach,
def max_interval_count(intervals):
interval_sorted = []
for idx, interval in enumerate(intervals):
s, e = interval
interval_sorted.append([s, idx, 0]) # 0 for start
interval_sorted.append([e, idx, 1]) # 1 for end
interval_sorted.sort(key = lambda x: x[0])
print(interval_sorted)
number_of_starts = 0
number_of_ends = 0
overlap_count = {}
for event in interval_sorted:
_, idx, start_end = event
if start_end == 0: # start event
# subtract all the ending before it
overlap_count[idx] = - (number_of_ends)
number_of_starts += 1
else: # end event
overlap_count[idx] += (number_of_starts - 1) # -1 as we should not include the start from the same interval
number_of_ends += 1
print(overlap_count)
ans_idx = -1
max_over_count = 0
min_len_interval = 99999999999
for idx, overl_cnt in overlap_count.items():
if overl_cnt > max_over_count:
ans_idx = idx
max_over_count = overl_cnt
elif overl_cnt == max_over_count and overl_cnt > 0 and (intervals[idx][1] - intervals[idx][0] + 1) < min_len_interval:
min_len_interval = (intervals[idx][1] - intervals[idx][0] + 1)
ans_idx = idx
if ans_idx == -1:
return ans_idx
return intervals[ans_idx]
if __name__ == "__main__":
test_1 = [[1,5],[5,10],[5,5]]
test_2 = [[1,2],[3,5]]
test_3 = [(1,6), (2,3), (4,11)]
ans = max_interval_count(test_1)
print(ans)
print("---------")
ans = max_interval_count(test_2)
print(ans)
print("---------")
ans = max_interval_count(test_3)
print(ans)
print("---------")
[[1, 0, 0], [5, 0, 1], [5, 1, 0], [5, 2, 0], [5, 2, 1], [10, 1, 1]]
{0: 0, 1: 1, 2: 1}
[5, 5]
---------
[[1, 0, 0], [2, 0, 1], [3, 1, 0], [5, 1, 1]]
{0: 0, 1: 0}
-1
---------
[[1, 0, 0], [2, 1, 0], [3, 1, 1], [4, 2, 0], [6, 0, 1], [11, 2, 1]]
{0: 2, 1: 1, 2: 1}
(1, 6)
---------

- 10,298
- 4
- 33
- 60