There are two things going on.
First, Python intentionally limits recursion to a fixed depth. Unlike, say, Scheme, which will keep allocating frames for recursive calls until you run out of memory, Python (or at least the most popular implementation, CPython) will only allocate `sys.getrecursionlimit()` frames (defaulting to 1000) before failing. There are reasons for that,* but they aren't really relevant here; all you need to know is that the limit exists.
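You can see both halves of that in a few lines (a minimal sketch; the exact default and the exception type vary a bit by version):

```python
import sys

print(sys.getrecursionlimit())  # 1000 by default in CPython

def countdown(n):
    if n:
        countdown(n - 1)

countdown(5000)  # RecursionError (RuntimeError before Python 3.5)
```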
Second, as you may already know, while QuickSort is O(N log N) on most lists, it has a worst case of O(N^2), in particular (with the standard pivot rules) on already-sorted lists. And when that happens, your stack depth can end up being O(N). So, if you have 1000 elements arranged in worst-case order and you're already one frame into the stack, you're going to overflow.
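To make that concrete, here's a deliberately naive quicksort, using the first element as the pivot (the names are just for illustration), that blows the default limit on already-sorted input:

```python
def naive_quicksort(lst):
    if len(lst) <= 1:
        return lst
    pivot, rest = lst[0], lst[1:]
    less = [x for x in rest if x < pivot]
    more = [x for x in rest if x >= pivot]
    return naive_quicksort(less) + [pivot] + naive_quicksort(more)

# On sorted input, every partition is empty/full, so the recursion
# goes ~1000 frames deep and raises RecursionError.
naive_quicksort(list(range(1000)))
```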
You can work around this in a few ways:
- Rewrite the code to be iterative, with an explicit stack, so you're only limited by heap memory instead of stack depth (see the first sketch after this list).
- Make sure to always recurse into the shorter side first, rather than the left side. This means that even in the O(N^2) case, your stack depth is still O(log N). But that only works if you've already done the previous step** (the second sketch after this list combines the two).
- Use a random, median-of-three, or other pivot rule so that common cases like already-sorted input are no longer the worst case. (Of course someone can still intentionally DoS your code; there's really no way to avoid that with quicksort.) The Wikipedia article has some discussion of this, and links to the classic Sedgewick and Knuth papers.
- Use a Python implementation with an unlimited stack.***
- Raise the recursion limit to match the input: `sys.setrecursionlimit(max(sys.getrecursionlimit(), len(myList)+CONSTANT))`. This way, you'll fail right off the bat for an obvious reason if you can't make enough space, and usually won't fail otherwise. (But you might; you could be starting the sort already 900 steps deep in the stack…) Still, this is a bad idea.**** Besides, you have to figure out the right `CONSTANT`, which is impossible in general.*****
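For the first option, here's one possible iterative rewrite (a sketch, not the only way to do it), using an explicit list of `(lo, hi)` ranges in place of the call stack:

```python
def quicksort_iterative(lst):
    """Sort lst in place without recursion."""
    stack = [(0, len(lst) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        # Lomuto partition around the last element.
        pivot = lst[hi]
        i = lo
        for j in range(lo, hi):
            if lst[j] < pivot:
                lst[i], lst[j] = lst[j], lst[i]
                i += 1
        lst[i], lst[hi] = lst[hi], lst[i]
        # Push the larger range first so the shorter side is processed
        # next; per the shorter-side-first rule, this keeps the explicit
        # stack O(log N) deep even in the worst case.
        if (i - lo) > (hi - i):
            stack.append((lo, i - 1))
            stack.append((i + 1, hi))
        else:
            stack.append((i + 1, hi))
            stack.append((lo, i - 1))
```

With this, `quicksort_iterative(list(range(100000)))` runs fine without touching the recursion limit.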
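And here's a sketch combining the next two bullets (and the hybrid from footnote **): a random pivot, recursion only into the shorter side, and a loop that manually unwraps the tail recursion on the longer side, keeping the stack depth O(log N):

```python
import random

def quicksort_hybrid(lst, lo=0, hi=None):
    """Sort lst[lo:hi+1] in place with O(log N) stack depth."""
    if hi is None:
        hi = len(lst) - 1
    while lo < hi:
        # Random pivot: already-sorted input is no longer a worst case.
        k = random.randint(lo, hi)
        lst[k], lst[hi] = lst[hi], lst[k]
        pivot, i = lst[hi], lo
        for j in range(lo, hi):
            if lst[j] < pivot:
                lst[i], lst[j] = lst[j], lst[i]
                i += 1
        lst[i], lst[hi] = lst[hi], lst[i]
        # Recurse into the shorter side; loop on the longer one
        # (manual tail call elimination).
        if i - lo < hi - i:
            quicksort_hybrid(lst, lo, i - 1)
            lo = i + 1
        else:
            quicksort_hybrid(lst, i + 1, hi)
            hi = i - 1
```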
* Historically, the CPython interpreter has recursively called itself for recursive Python function calls. And the C stack is fixed in size; if you overrun the end, you could segfault, stomp all over heap memory, or cause all kinds of other problems. This could be changed; in fact, Stackless Python started off as basically just CPython with this change. But the core devs have intentionally chosen not to do so, in part because they don't want to encourage people to write deeply recursive code.
** Or if your language does automatic tail call elimination; Python doesn't. But, as gnibbler points out, you can write a hybrid solution (recurse on the small end, then manually unwrap the tail recursion on the large end) that won't require an explicit stack.
*** Stackless and PyPy can both be configured this way.
**** For one thing, eventually you're going to crash the C stack.
***** The constant isn't really constant; it depends on how deep you already are in the stack (computable non-portably by walking `sys._getframe()` up to the top) and how much slack you need for comparison functions, etc. (not computable at all; you just have to guess).
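If you do want to measure that first half, here's a non-portable sketch; `sys._getframe()` and frame objects are CPython implementation details:

```python
import sys

def current_depth():
    # Walk the frame chain up to the top of the stack.
    depth, frame = 0, sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth
```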