What's wrong with my mergesort implementation?

Question

On smaller lists of up to size N = ~1950 or so, I get the correct output... however, list sizes that are not much larger return an error rather than the expected result. My code:

def merge(left, right, aux=[]):
    if left == []:
        for x in right:
            aux.append(x)
        return aux
    elif right == []:
        for x in left:
            aux.append(x)
        return aux
    elif left[0] > right[0]:
        aux.append(right[0])
        return merge(left, right[1:], aux)
    else:
        aux.append(left[0])
        return merge(left[1:], right, aux)

def msort(values):
    size = len(values)

    if size == 1:
        return values
    else:
        mid = size/2
        return merge(msort(values[:mid]), msort(values[mid:]), [])

Running msort on these test lists give me the expected (ascending order) output:

val = [x for x in range(1950, 0, -1)
val = [x for x in range(4,0,-1)]

e.g. [1,2,3,...,1948,1949,1950] and [1,2,3,4]

However, when I run msort on this test case:

val = [x for x in range(2000,0,-1)]

I now receive this error (after numerous tracebacks to merge):

RuntimeError: maximum recursion depth exceeded in cmp

So my question is: What went wrong with my implementation here? I can't use it with lists of N ~ >=2000. Why?

What about `if size <= 1:`? And `list.extend(other_list)` to append a list. — User, Apr 26 '13 at 13:15
`[]` cases are account for in merge, and I do not account for passing empty cases directly into `msort`. — eazar001, Apr 26 '13 at 13:17
The Problen I think is the recursive merge function. It will call itself 1000 times recursively for 2000 elements. 1000 is the maximum recursion limit set in module `sys`. Use a for loop instead. — User, Apr 26 '13 at 13:19
It's wrong when called multiple times. See [“Least Astonishment” in Python: The Mutable Default Argument](http://stackoverflow.com/q/1132941/395760) — , Apr 26 '13 at 13:20
@delnan: I thought that at first too, but ISTM `merge` is always called with a value for the `aux` argument. It's still dangerous, but I don't think it's actually causing problems here. — DSM, Apr 26 '13 at 13:22
That's so not merge sort... In merge sort you divide the list in two *halves*, not in head and tail! — Bakuriu, Apr 26 '13 at 13:23
@Bakuriu, what do you mean by head and tail? Isn't that what `msort` is doing? — eazar001, Apr 26 '13 at 13:26

mgilson · Answer 1 · 2013-04-26T13:50:56.740

4

The problem is that you're doing the merge recursively. As pointed out, it's fine to call merge(msort(left),msort(right)), but since your merge function actually calls itself to do the merging, you're hitting the recursion limit.

Consider calling your merge function on lists of length 1000.

To merge those lists, you need 2000 calls to merge (since you only add ~1 element to aux with each call).

edited Apr 26 '13 at 13:50

answered Apr 26 '13 at 13:29

mgilson

300,191
65
633
696

Heh, I guess the answer was glaring right at me! [=, thanks for your input! – eazar001 Apr 26 '13 at 13:46
After doing some reasearch here on SO, and on other websites, it appears my main problem was my misunderstanding of python. Specifically, I was under the impression that each predecessor call to merge would be treated trivially in the sense that they would get cleared from the stack through each progression. But after a little investigation, it appears that python doesn't optimize for tail recursion, without explicitly coding this kind of behavior. Incidentally, it appears you have addressed this type of question in a past post. Perhaps, I should have worded my question differently. – eazar001 Apr 28 '13 at 03:14

User · Accepted Answer · 2013-04-26T13:42:13.510

2

Your merge function uses recursion that has a limit.

If you iterate instead of recurring you circumvent this:

def merge(left, right, aux=[]):
    while left and right:
        if left[0] > right[0]:
            aux.append(right.pop(0))
        else:
            aux.append(left.pop(0))
    aux.extend(right)
    aux.extend(left)
    return aux

This is an example of the usage:

>>> merge([1,3,5,7], [3,4,5,6])
[1, 3, 3, 4, 5, 5, 6, 7]

edited Apr 26 '13 at 13:42

answered Apr 26 '13 at 13:27

User

14,131
2
40
59

oh the NONE return was a typo on my part, your code works fine, currently testing on larger list... – eazar001 Apr 26 '13 at 13:43
Yea... I guess the recursive calls were causing some nasty problems for me, ouch.... thanks for your reply. – eazar001 Apr 26 '13 at 13:45

score 0 · Answer 3 · answered Apr 26 '13 at 13:20

0

You reached the maximum recursion depth.

This means, that your function msort calls itself more often, than python allows this by default.

You can increase this default value with sys.setrecursionlimit.

answered Apr 26 '13 at 13:20

Stefan

2,460
1
17
33

1

Though increasing it carries the risk of crashing the interpreter (killing the process and potentially corrupting memory, not raising a Python exception). – Apr 26 '13 at 13:21
1

The problem is not the recursion limit. A correct merge sort implementation does `logn` recursive calls, hence that error should *never* happen since python lists can have at most 500 million items(and the logn of that is less than 32). – Bakuriu Apr 26 '13 at 13:21
On my system, the recursionlimit (obtained with `sys.getrecursionlimit()`) is `1000`. In case of a bad implementation, this is not enough. – Stefan Apr 26 '13 at 13:24
@Bakuriu Please note, that the OP's `merge` is also a recursive function. – Stefan Apr 26 '13 at 13:28
increasing sys.setrecursionlimit only delays the problem, it doesn't solve it. – Lennart Regebro Apr 26 '13 at 13:33
@Bakuriu Wrong, 2n - 1 is the worst case number of calls – jamylak Apr 26 '13 at 14:42
@jamylak By "number of recursive calls" I meant the height of the recursion tree, and since each call halves the size it's *definitely* `logn`. – Bakuriu Apr 26 '13 at 15:00
@Bakuriu Right I misinterpreted you – jamylak Apr 26 '13 at 15:05

score 0 · Answer 4 · answered Apr 26 '13 at 13:20

0

Python has a limit for recursion. You can set the maximum (be careful ) using:

import sys
sys.setrecursionlimit(MAXRECURSION)

I was able to run your list

val = [x for x in range(2000,0,-1)]

with a MAXRECURSION of 2500

answered Apr 26 '13 at 13:20

jujaro

447
2
4

And for a list that's 4000 items? – Lennart Regebro Apr 26 '13 at 13:34

What's wrong with my mergesort implementation?

4 Answers4