0

description:

  • I need merge two lists whose element is a range value for serval (at least 10000) times efficiently. All elements should merge if they can.
  • Two lists are sorted already.
  • Every elements are strictly separate, it means these case are illegal:
list1 = [[1, 3], [3, 6]]
list2 = [[1, 4], [2, 5]]

list1 = [[1, 5], [6, 10]]
list2 = [[2, 4], [5, 8]]

example:

#elements: 0:start(inclusive) 1:stop(inclusive).
#the `ans` become next `list1`, to merge a new `list2` .

#input1:
list1 = [[1, 1],[3, 3]]
list2 = [[5, 5]]
#output1:
ans = [[1, 1], [3, 3], [5, 5]]

#input2:
list1 = [[1, 1], [3, 3], [5, 5]]
list2 = [[2, 2]]
#output2:
ans = [[1, 3], [5, 5]] # [1,1]+[2,2]+[3,3] = [1,3]

#input3:
list1 = [[1, 3], [5, 5]]
list2 = [[0, 0], [4, 4], [6, 6]]
#output:
ans = [[0,6]] #[0,0]+[1,3]+[4,4]+[5,5]+[6,6] = [0,6]

what I have tried:

    def merge(list1,list2):
        ans = sorted(list1+list2,key = lambda x:x[0])
        idx = 0
        while idx<len(ans):
            try:
                if ans[idx][1] == ans[idx+1][0] - 1:
                    ans[idx] = [ans[idx][0],ans[idx+1][1]]
                    del ans[idx+1]
                elif ans[idx][0] == ans[idx+1][1] + 1:
                    ans[idx] = [ans[idx+1][0],ans[idx][1]]
                    del ans[idx+1]
                else:
                    idx+=1
            except Exception:
                idx+=1
        return ans
  • It works, but it is really slow. It tooks about 15 secs to solve a difficult case, 1.2 sec to solve a simple case.
  • Desired time is less than 3 secs to solve a difficult case.

question

  • Any better solution?
  • Or which algorithm should I use? Maybe segment tree?
leaf_yakitori
  • 2,232
  • 1
  • 9
  • 21

2 Answers2

3

You are not using all the information. The lists are sorted already, so concatenating them and resorting is unnecessary.

def merge(list1, list2):
    # helper to process one interval from a given list
    def consume(lst, index):
        start, end = interval = lst[index]
        if not result or result[-1][-1] < start - 1:
            result.append(interval)
        elif result[-1][-1] < end:
            result[-1][-1] = end
        return index + 1

    result = []
    i1 = i2 = 0
    while i1 < len(list1) and i2 < len(list2):
        # saving some boilerplate code to always work on list1
        if list1[i1] > list2[i2]:
            list1, i1, list2, i2 = list2, i2, list1, i1
        i1 = consume(list1, i1)
    while i2 < len(list2):
        i2 = consume(list2, i2)
    return result

list1 = [[1, 1],[3, 3]]
list2 = [[5, 5]]
merge(list1, list2)
# [[1, 1], [3, 3], [5, 5]]
list1 = [[1, 1], [3, 3], [5, 5]]
list2 = [[2, 2]]
merge(list1, list2)
# [[1, 3], [5, 5]]
list1 = [[1, 3], [5, 5]]
list2 = [[0, 0], [4, 4], [6, 6]]
merge(list1, list2)
# [[0, 6]]

The algorithmic advantages of this approach seem to be outweighed by the brute C-Power of Python sorting over the context changes (particularly with the nested function). One could try to improve the solution suggested by @trincot a little though (not design-wise, but from a pure time/space-efficiency standpoint):

def merge(list1, list2):
    # sorted(l1+l2)  creates 2 new lists in memory
    list1.extend(list2)
    list1.sort()
    result = list1[:1]
    for interval in list1:
        start, end = interval
        _, e = last = result[-1]
        if e < start - 1:
            result.append(interval)
        elif e < end:  
            last[-1] = end
    return result
user2390182
  • 72,016
  • 6
  • 67
  • 89
  • Sorry I try your solution but it still cost 15 secs to solve a difficult case, while the @trincot solution cost 5 secs to a same case. +1 for your effort. – leaf_yakitori Aug 27 '21 at 07:35
  • https://stackoverflow.com/questions/464342/combining-two-sorted-lists-in-python, it shows if elements less than 1000000, concatenatie and resort it maybe the more efficient solution. – leaf_yakitori Aug 27 '21 at 07:49
  • This was also my solution since the time complexity is just linear `O(n+m)`. Maybe the `sorted` functionality in Python is so fast that it beats all the necessary extra loops made here :) – Niel Godfrey Pablo Ponciano Aug 27 '21 at 07:50
  • 1
    Yes, `sorted` is implemented with compiled code, so for limited list sizes (maybe below a million), it will still be faster than looping "manually" in Python, even though there is a time complexity difference. – trincot Aug 27 '21 at 08:14
  • Also be careful: this solution will *mutate* some elements in the input lists. I would change it a bit so that the result is made up of new pairs. It can be a surprise for a caller to find out that their input lists were altered after making the `merge` call. – trincot Aug 27 '21 at 08:23
  • @trincot Of course in production code, mutating arguments would be a no go (unless it's the purpose of a function). This seemingly being a mostly academic challenge... should be fine :) – user2390182 Aug 27 '21 at 08:37
1

del will have an impact on performance... try to avoid it. I think it is better to let the answer list start as an empty list, and append ranges to it as you go.

Also, I don't think it is worth to provide a key argument to sorted: by default it will anyway sort by the first value in the pairs, and if there is a tie, it will sort by the second value.

Here is how it could work:

def merge(list1,list2):
    ans = []
    last = [None, -float("inf")]
    for start, end in sorted(list1+list2):
        if start > last[1] + 1:
            last = [start, end]
            ans.append(last)
        elif end > last[1]:  # Merge:
            last[1] = end  # We mutate a pair that is already in the result
    return ans
trincot
  • 317,000
  • 35
  • 244
  • 286
  • Thanks! It does much quicker then mine, it tooks about 5.2 secs to solve a difficult case. I want to see more solution, so I'll accept your answer tomorrow if I don't get any better solution. – leaf_yakitori Aug 27 '21 at 06:59