1

Possible Interview Question: How to Find All Overlapping Intervals => provide us a solution to find all the overlapping intervals. On top of this problem, imagine each interval has a weight. I am aiming to find those overlap intervals summed weight, when a new interval is inserted.

Condition: Newly inserted interval's end value is always larger than the previously inserted interval's end point, this will lead us to have already sorted end points.

When a new interval and its weight is inserted, all the overlapped intervals summed weight should be checked that does it exceeds the limit or not. For example when we insert [15, 70] 2, [15, 20] 's summed weight will be 130 and it should give an error since it exceed the limit=128, if not the newly inserted interval will be append to the list.

int limit = 128;
Inserted itervals in order:
order_come | start | end | weight
0            [10,    20]     32  
1            [15,    25]     32  
2            [5,     30]     32 
3            [30,    40]     64
4            [1,     50]     16
5            [1,     60]     16 
6            [15,    70]      2 <=should not append to the list.

enter image description here

Final overall summed weight view of the List after `[15, 70] 2` is inserted:
[60, 70, 2]     
[50, 60, 18]    
[40, 50, 34]    
[30, 40, 98]    
[25, 30, 66]    
[20, 25, 98]    
[15, 20, 130]  <= exceeds the limit=128, throw an error. 
[10, 15, 96]
[5, 10, 64]
[1, 5, 32]
[0, 0, 0]

Thank you for your valuable time and help.

Community
  • 1
  • 1
alper
  • 2,919
  • 9
  • 53
  • 102

2 Answers2

2

O(log n)-time inserts are doable with an augmented binary search tree. To store

order_come | start | end | weight
0            [10,    20]     32
1            [15,    25]     32
2            [5,     30]     32
3            [30,    40]     64
4            [1,     50]     16
5            [1,     60]     16

we have a tree shaped like

                   25
                 /    \
               /        \
             10          50
            /  \        /  \
           5    20    40    60
          /    /     /
         1   15    30         ,

where each number represents the interval from it to its successor. Associated with each tree node are two numbers. The first we call ∆weight, defined to be the weight of the node's interval minus the weight of the node's parent's interval, if extent (otherwise zero). The second we call ∆max, defined to be the maximum weight of an interval corresponding to a descendant of the node, minus the node's weight.

For the above example,

interval | tree node | total weight | ∆weight | ∆max
[1, 5)     1           32             -32       0
[5, 10)    5           64             -32       0
[10, 15)   10          96             32        32
[15, 20)   15          128            32        0
[20, 25)   20          96             0         32
[25, 30)   25          64             64        64
[30, 40)   30          96             64        0
[40, 50)   40          32             16        64
[50, 60)   50          16             -48       80
[60, ∞)    60          0              -16       0

Binary search tree operations almost invariably require rotations. When we rotate a tree like

    p          c
   / \        / \
  c   r  =>  l   p
 / \            / \
l   g          g   r

we modify

c.∆weight += p.∆weight
g.∆weight += c.∆weight
g.∆weight -= p.∆weight
p.∆weight -= c.∆weight
p.∆max = max(0, g.∆max + g.∆weight, r.∆max + r.∆weight)
c.∆max = max(0, l.∆max + l.∆weight, p.∆max + p.∆weight).

The point of the augmentations is as follows. To find the max weight in the tree, compute r.∆max + r.∆value where r is the root. To increase every weight in a subtree by a given quantity ∂, increase the subtree root's ∆weight by ∂. By changing O(log n) nodes with inclusion-exclusion, we can increase a whole interval. Together, these operations allow us to evaluate an insertion in time O(log n).

To find the total weight of an interval, search for that interval as normal while also adding up the ∆weight values of that interval's ancestors. For example, to find the weight of [15, 30], we look for 15, traversing 25 (∆weight = 64), 10 (∆weight = 32), 20 (∆weight = 0), and 15 (∆weight = 32), for a total weight of 64 + 32 + 0 + 32 = 128.

To find the maximum total weight along a hypothetical interval, we do a modified search something like this. Using another modified search, compute the greatest tree value less than or equal to start (predstart; let predstart = -∞ if start is all tree values are greater than start) and pass it to this maxtotalweight.

maxtotalweight(root, predstart, end):
    if root is nil:
        return -∞
    if end <= root.value:
        return maxtotalweight(root.leftchild, predstart, end) + root.∆weight
    if predstart > root.value:
        return maxtotalweight(root.rightchild, predstart, end) + root.∆weight
    lmtw = maxtotalweight1a(root.leftchild, predstart)
    rmtw = maxtotalweight1b(root.rightchild, end)
    return max(lmtw, 0, rmtw) + root.∆weight

maxtotalweight1a(root, predstart):
    if root is nil:
        return -∞
    if predstart > root.value:
        return maxtotalweight1a(root.rightchild, predstart) + root.∆weight
    lmtw = maxtotalweight1a(root.leftchild, predstart)
    return max(lmtw, 0, root.rightchild.∆max + root.rightchild.∆weight) + root.∆weight

maxtotalweight1b(root, end):
    if root is nil:
        return -∞
    if end <= root.value:
        return maxtotalweight1b(root.leftchild, end) + root.∆weight
    rmtw = maxtotalweight1b(root.rightchild, end)
    return max(root.leftchild.∆max + root.leftchild.∆weight, 0, rmtw) + root.∆weight

We assume that nil has ∆weight = 0 and ∆max = -∞. Sorry for all of the missing details.

David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
  • Thank you very much @David Eisenstat. I have a small question, during insertion operation, is it possible to cover also all overlapped intervals summed weights in O(log n)? I think the algorithm you described is doing it but, I was not able figure out how we could obtain the `total weight` for the each `interval`. For example, how could we obtain, "total weight:128" for [15, 20) under O(log n), could you explain it with an small example as how the algorithm works when we try to insert `[start:15, end:70] weight:2` into the augmented BST? And what was the reason we rotated the tree? – alper Feb 16 '17 at 09:04
  • @Avatar Edited in the algorithm. The reason to rotate the tree is to get a worst-case O(log n) bound on operation times. – David Eisenstat Feb 16 '17 at 13:24
  • To double check, ex: to find the weight of [15,20], we look for 15, traversing 25(∆weight=64), 10(∆weight=32), 20(∆weight=0), and 15(∆weight=32), for a total weight of 64+32+0+32=128, right? I am sorry for asking too much questions, I did not mentioned in the original question but do you have any guidance related to how could we insert() when a new interval such as `[start:45,end:55] weight:2` arrived? that would be very helpful. Could we do it right after we call `maxtotalweight()`,or could we insert the interval while we are calculating `maxtotalweight()` for the efficiency? @David Eisenstat – alper Feb 16 '17 at 15:24
  • @Avatar You could definitely combine the search to find the insertion point with the search to compute the interval max. It would probably be profitable, but I'm not sure. – David Eisenstat Feb 16 '17 at 15:30
  • Go it, thank you, I guess insertion should not be that difficult. My main goal was, to insert an interval into the tree, if the sum of overlapped intervals would not exceed the limit. While I compute the internal max, I could also find the insertion point, and if it does not exceed the limit insertion could be done I guess. Insertion should be done with its calculated ∆weight | ∆max value, right?. @David Eisenstat – alper Feb 16 '17 at 15:57
  • After each insertion all the travelled interval's ∆weight | ∆max value should also be updated, or remain as it is? @David Eisenstat – alper Feb 16 '17 at 16:45
  • 1
    @Avatar Insertion happens at the leaves; you should accumulate ∆weight as you traverse leafward and then set the new leaf's ∆weight appropriately. ∆max should be updated for all ancestors of changed nodes, using the formula in the rotation logic. – David Eisenstat Feb 16 '17 at 17:13
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/135884/discussion-between-avatar-and-david-eisenstat). – alper Feb 16 '17 at 18:23
1

Using the terminology of original answer when you have

'1E 2E 3E ... (n-1)E nE'

end-points already sorted and your (n+1)st end-point is grater than all previous end-points you only need to find intervals with end-point value greater then (n+1)st start-point (greater or equal in case of closed intervals).

In other words - iterate over intervals starting from most-right end-point to the left until you reach the interval with end-point lesser or equal than (n+1)st start-point and keep track of sum of weights. Then check if the sum fits into the limit. Worst case time-complexity is O(n) when all previous intervals have end-point grater then (n+1)st start-point.

Community
  • 1
  • 1
MartinPtrl
  • 71
  • 2
  • 8