-3

I'm quite unfamiliar with Python and decided to practice some algorithms here is my implementation of the merge sort, I'm not too sure why I'm getting this error:

Traceback (most recent call last):
  File "<string>", line 44, in <module>
   
UnboundLocalError: local variable 'mid' referenced before assignment
a = [1,2,3,5,3,1,8,4,5,6]

def mergesort(x):
    n = len(x)
    if n > 1:
        mid = (n//2)
        left = x[:mid]
        right =x[mid:]
    for i in range(0, mid -1):
        left[i] = x[i]
    for i in range(mid, n-1):
        right[i - mid] = x[i]
    mergesort(left)
    mergesort(right)
    merge(left,right,x)
martineau
  • 119,623
  • 25
  • 170
  • 301
Omar Inuwa
  • 47
  • 5
  • 6
    don't use semicolons at the end of lines – Jean-François Fabre Dec 27 '21 at 20:10
  • All this comes down to an indentation problem. Correct indentation is crucial in Python. – trincot Dec 27 '21 at 20:11
  • Python can only tell that there's a code path in your function that might try to use `mid` before any value had ever been assigned to it — so just define it by assigning something to it (like `None`) at the very beginning and the error should go away. – martineau Dec 27 '21 at 21:07
  • merge is not defined in your code, could you show more ? – Malo Dec 27 '21 at 21:25
  • 1
    Does this answer your question? ["UnboundLocalError: local variable referenced before assignment" after an if statement](https://stackoverflow.com/questions/15367760/unboundlocalerror-local-variable-referenced-before-assignment-after-an-if-sta) – Tomerikoo Dec 30 '21 at 00:20

2 Answers2

0

The if-block only runs when n > 1, and mid is initialized in it.

When n <= 1, the if-block will not be executed, and so mid is not initialized. So, in any of the following lines, if you reference mid, the error will be thrown.

This is how I'd implement mergesort:

def mergesort(lst):
    n = len(lst)
    if n <= 1:
        return lst
    else:
        left = mergesort(lst[:n//2])
        right = mergesort(lst[n//2:])
        return merge(left, right)

def merge(l1, l2):
    i, j = 0, 0
    res = []
    while i<len(l1) and j<len(l2):
        a, b = l1[i], l2[j]
        if a < b:
            res.append(a)
            i += 1
        elif a > b:
            res.append(b)
            j += 1
        else:
            res.append(a)
            res.append(b)
            i += 1
            j += 1
    return res + l1[i:] + l2[j:]

I tweaked the algorithm for readability. Swap out the list slicing for a more efficient approach (e.g. indexing) for speed optimisation.

Ci Leong
  • 92
  • 11
  • how to you define merge ? – Malo Dec 27 '21 at 21:38
  • @Malo I added the `merge` function. – Ci Leong Dec 28 '21 at 13:35
  • I tested your solution, but it removes duplicate values. – Malo Dec 29 '21 at 11:30
  • @Malo Ah I must have missed it, in the else-block, instead of just appending `a`, I need to append `b` as well. Problem is the duplicates in the same list will be merged in, but the duplicates at the separate list (at specific position of the time it is checked) will not. Fixed. – Ci Leong Dec 29 '21 at 17:28
  • @Malo Your solution has worst case `O(N^2logN)` time. `pop(0)` takes `O(N)` time (see [here](https://stackoverflow.com/questions/195625/what-is-the-time-complexity-of-popping-elements-from-list-in-python/46136638)), where `N` is the length of the list you popped from, as the remaining elements have to be shifted to left. Mergesort is expected to be in `O(NlogN)` time, otherwise it is slower than selection, insertion sort. – Ci Leong Dec 30 '21 at 07:17
  • good point. I would need to use collections.deque (doubly linked list) to be O(1) for pop(-1) and pop(0). Moreover my code consumes the l1 and l2 lists. Which may not be good in some cases. – Malo Dec 30 '21 at 09:30
  • 1
    I timed both solution. Despite a better Big O, your solution is slower when I timed it. Results may be different depending on the initial list and number of timed iteration (100000 or 1000000). – Malo Dec 30 '21 at 09:47
  • When I run it once with a very large lists of 50000 randomized int elements, your optimized code becomes a bit faster. – Malo Dec 30 '21 at 09:59
  • @Malo Slicing slowed it down, a `for i in range` loop can be used instead to append the items, but this kills readability. – Ci Leong Dec 30 '21 at 10:08
0

A shorter way to to this based on @Bill Ong answer:

a = [1,2,3,5,3,1,8,4,5,6]

def merge(lst1, lst2):
## merges two sorted lists with same length +/-1
    res = []
    while lst1 and lst2:
        if lst1[0] < lst2[0]:
            res.append(lst1.pop(0))
        else:
            res.append(lst2.pop(0))
    return res+lst1+lst2


def mergesort(lst):
    n=len(lst)
    if n <= 1:
        return lst
    return merge(mergesort(lst[:n//2]), mergesort(lst[n//2:]))

print(mergesort(a))

Output is:

[1, 1, 2, 3, 3, 4, 5, 5, 6, 8]
Malo
  • 1,233
  • 1
  • 8
  • 25