First I assume from your example that (0,1)
and (1,2)
overlap.
Btw, your example is incorrect, (3,7)
does not overlap with (0,2)
One way to solve this is:
- Sort the intervals based on the starting point
- Iterate from lowest starting point to highest point
2a. Count the number of previous endpoints larger or equal than current starting point.
2b. Add one to the count of current end point.
Step 1 can be done in O(n log n)
Step 2 is iterating over all the intervals while doing the count. So it's O(n * X)
where X
is the complexity to do the counting. With Fenwick Tree, we can do that in O(log n)
1, so step 2 can be completed in O(n log n)
also.
So the overall complexity is O(n log n)
.
Pseudocode:
def fenwick_tree.get(num):
return the sum from counter[0] to counter[num]
def fenwick_tree.add(num, val):
add one to counter[num]
intervals = [...]
sort(intervals) # using the starting point as the key
init_fenwick_tree()
sum = 0
count = 0
for (starting_point, end_point) in intervals:
sum = sum + (count - fenwick_tree.get(starting_point-1))
fenwick_tree.add(end_point,1)
return sum
Python code (only works when the interval points are non-negative integers):
MAX_VALUE = 2**20-1
f_arr = [0]*MAX_VALUE
def reset():
global f_arr, MAX_VALUE
f_arr[:] = [0]*MAX_VALUE
def update(idx,val):
global f_arr
while idx<MAX_VALUE:
f_arr[idx]+=val
idx += (idx & -idx)
def read(idx):
global f_arr
if idx <= 0:
return 0
result = 0
while idx > 0:
result += f_arr[idx]
idx -= (idx & -idx)
return result
intervals = [(1,4),(3,7),(5,8),(14,17),(0,2),(11,14)]
intervals = sorted(intervals, key=lambda x: x[0])
reset()
total = 0
for processed, interval in enumerate(intervals):
(start, end) = interval
total += processed - read(start-1)
update(end, 1)
print total
Will print 4
, which comes from these overlaps:
(0,2) - (1,4)
(1,4) - (3,7)
(3,7) - (5,8)
(11,14) - (14,17)
Note that we can't get the overlapping intervals, since in the worst case there will be O(n^2)
overlaps, which cannot be printed in O(n log n)
time.
1Actually Fenwick tree does the counting in O(log M)
time, where M
is the largest possible number of distinct values in the interval. Note that M <= 2n
since there can only be that many distinct values. So it's also correct to say that Fenwick Tree does the counting in O(log n)
time.