2

I have been trying to solve this problem in an efficient way. The problem is:

Problem Statement

Given a list of tuples in the form [(start1, end1), (start2, end2), (start3, end3)....(startn, endn)] where start and end are positive integers. Each tuple is to represent a time window, for example: [(1, 3), (73, 80)...]. Find the time (integer) where max concurrency occurs and get the tuples where max concurrency occurs.

Constraints:

  1. start and end are integers of time and are between 0 to n
  2. For all cases start < end
  3. start is inclusive but end is exclusive
  4. For the time (integer) where maximum concurrency occurs, we can get only one if there are multiple cases

For example the schedule below will have max_concurrency at time 2 and the tuples are (0,3), (2,3), (1, 200) that have it.

schedule = [
            (0, 3),
            (3, 5),
            (2, 3),
            (6, 8),
            (10, 12),
            (73, 92),
            (1, 200),
            ]

My Code

For time at where maximum concurrency occurs. Correct me if I'm wrong but I think this runs in O(n^2) time.

from collections import defaultdict

schedule_dict = defaultdict(lambda: 0)

for start, end in schedule:
    for time in range(start, end):
            schedule_dict[time] += 1

max_concurrency = max(schedule_dict, key=schedule_dict.get)
print(f"Time where max concurrency happens is : {max_concurrency}")

Output

Time where max concurrency happens is : 2  

For the sessions where maximum concurrency occurs, I think this runs in O(n) time.

My Code

for start, end in schedule:
    if start <= max_concurrency < end:
        print(f"{(start, end)}")

Output

(0, 3)
(2, 3)
(1, 200)

Finally My Question

What is a more efficient way to do this to reduce time and space complexity?

Guy Coder
  • 24,501
  • 8
  • 71
  • 136
Rahul P
  • 2,493
  • 2
  • 17
  • 31
  • @GuyCoder : Updated the wording too.Thank you again :) – Rahul P Feb 08 '20 at 14:25
  • 1
    @GuyCoder I'm just here to learn (while also trying to give back to the community). Thank you for both your suggestions on how to improve the question. – Rahul P Feb 08 '20 at 14:28
  • Related / duplicates : [1](https://stackoverflow.com/questions/53898941/the-most-efficient-way-to-find-the-point-where-maximum-number-of-intervals-overl) , [2](https://stackoverflow.com/questions/18365107/maximum-no-of-overlaps-of-all-time-intervals) , [3](https://stackoverflow.com/questions/35739542/given-a-set-of-intervals-how-to-find-the-maximum-number-of-intersections-among) [4](https://stackoverflow.com/questions/2244964/finding-number-of-overlaps-in-a-list-of-time-ranges) , [5](https://stackoverflow.com/questions/31522450/find-the-year-with-the-most-number-of-people-alive-in-python) – jrook Feb 08 '20 at 19:46

1 Answers1

1

The number of intervals overlapping any time instant T is the number of interval start times less than or equal to T, minus the number of interval end times less than or equal to T.

  1. Put the start times and end times in separate lists, and sort them.
  2. Initialize a depth counter to 0
  3. Walk through the lists in order (like merge sort), adding 1 for each start time, and subtracting 1 for each end time
  4. Remember when the counter reaches a maximum -- that's the time of maximum overlap.

Here's an implementation in python:

schedule = [
  (0, 3),  (3, 5), (2, 3), (6, 8),
  (10, 12), (73, 92), (1, 200),
  ]

starts = [x[0] for x in schedule]
ends = [x[1] for x in schedule]

starts.sort()
ends.sort()

endpos = 0
depth = 0
maxdepth = 0
maxdepthtime = -1

for time in starts:
    depth+=1
    while endpos < len(ends) and ends[endpos]<= time:
        depth -= 1
        endpos += 1
    if depth > maxdepth:
        maxdepth = depth
        maxdepthtime = time

overlappers = [x for x in schedule
    if (x[0] <= maxdepthtime and x[1] > maxdepthtime)]

print ("Max overlap at time: ", maxdepthtime, " depth ", maxdepth)
print ("Intervals: ", overlappers)
Matt Timmermans
  • 53,709
  • 3
  • 46
  • 87
  • 1
    Your idea works and is simple, but is it fast? If I had the time I would try this as a constraint satisfaction problem, e.g. [Understanding CLP(FD) Prolog code of N-queens problem](https://stackoverflow.com/a/53412988/1243762) – Guy Coder Feb 08 '20 at 14:35
  • It's as fast as sorting a list of numbers. There's absolutely nothing to be gained by making it complicated. – Matt Timmermans Feb 08 '20 at 14:41
  • 1
    `It's as fast as sorting a list of numbers.` Can you walk through the details for the answer given in the question. The way I understand your answer with the comment is that the list only holds the start and end numbers, but the way I think of how your idea works requires putting in all of the numbers in each range and also counting them. Either I am misreading something or you are missing something. A walk through would clarify it. Thanks. – Guy Coder Feb 08 '20 at 14:48
  • 2
    I added an implementation. Note that the whole thing is O(n) except for the `.sort()`, because `endpos` only increases – Matt Timmermans Feb 08 '20 at 15:02
  • I am not an expert on Big O, but how can it be O(n) when you have a while loop in a for loop, to me that is greater than O(n) and less than O(n squared). – Guy Coder Feb 08 '20 at 15:05
  • 1
    Every iteration of the inner loop increments `endpos`, and it never decreases. It can't be `> len(ends)`, so there are at most `len(ends)` iterations of the inner loop **all together** – Matt Timmermans Feb 08 '20 at 15:08
  • If the total range is limited, counts can be stored in an array indexing all numbers in the range, leading to overall O(n) time. – jrook Feb 08 '20 at 16:54