3

I'm completely new to Python and I'm stumped by the following.

As part of a crash course, I've written a quicksort function using list comprehensions, as follows:

data = [78,3,3526,-12244,9,2,8,6,-84,3642,1,-1234, 234, 23, -1, -11, 34]
pivot = data[0]

def quicksort(lst):
    if len(lst) <= 1:
        return lst
    lessThanPivot = [x for x in lst if x < pivot]
    greaterThanPivot = [x for x in lst if x >= pivot]
    return quicksort(lessThanPivot) + [pivot] + quicksort(greaterThanPivot)

print(quicksort(data))

This results in the following output:

Traceback (most recent call last):
File "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.1\visualstudio_py_util.py", line 106, in exec_file
exec_code(code, file, global_variables)
File "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\Extensions\Microsoft\Python Tools for Visual Studio\2.1\visualstudio_py_util.py", line 82, in exec_code
exec(code_obj, global_variables)
File "C:\Users\user\documents\visual studio 2013\Projects\PythonApplication1\PythonApplication1\quickSortExercise.py", line 12, in <module>
print(quicksort(data))
[...]
File "C:\Users\user\documents\visual studio 2013\Projects\PythonApplication1\PythonApplication1\quickSortExercise.py", line 3, in quicksort
def quicksort(lst):
RuntimeError: maximum recursion depth exceeded

Yet the following works fine:

data = [78,3,3526,-12244,9,2,8,6,-84,3642,1,-1234, 234, 23, -1, -11, 34]
pivot = data[0]

def quicksort(lst):
    if len(lst) <= 1:
        return lst
    lessThanPivot = [x for x in lst[1:] if x < pivot]
    greaterThanPivot = [x for x in lst[1:] if x >= pivot]
    return quicksort(lessThanPivot) + [pivot] + quicksort(greaterThanPivot)
print(quicksort(data))

The only difference is lst[1:] instead of lst

Documentation on list comprehensions did not seem to address this.

Blorbstein
  • 33
  • 4
  • 5
    "*The only difference is lst[1:] instead of lst"* Well... that's kind of a big deal. The first will call the function with smaller and smaller subsets of the list. The second will cause infinite recursion. – Cory Kramer Feb 06 '15 at 16:10
  • 1
    Note: to insert inline code use backticks. When you want to provide error messages you should 1) format them as code 2) Copy **all** the output, from the line `Traceback (most recent call last):` up to the line `ErrorName: message`, not just the last line. – Bakuriu Feb 06 '15 at 16:14
  • @Cyber - Can you please elaborate? I thought that lst would produce the same result, since lessThanPivot is inherently a smaller list than lst, and quicksort is called recursively on that. – Blorbstein Feb 06 '15 at 16:26
  • @Bakuriu - Thanks, though the entire output is exceptionally long and repetitive so I feel inclined to omit part of it in this case. – Blorbstein Feb 06 '15 at 16:27
  • @Blorbstein Read up on [Python slicing notation](http://stackoverflow.com/questions/509211/explain-pythons-slice-notation) then look at those two versions of `lst`. – Cory Kramer Feb 06 '15 at 16:27
  • 1
    @Blorbstein If you have the same line repeated 1000 times just put a few, then put an ellipsis like `[...]` and the last lines. the point is: maybe *you* aren't able to understand the information that they convey but they are extrmely useful for *us* to understand what went wrong, especially were you cannot provide the full code used. – Bakuriu Feb 06 '15 at 16:32
  • @Bakuriu - understood, I'll edit that in. – Blorbstein Feb 06 '15 at 16:36

2 Answers2

4

You are never changing your pivot in the first snippet, so the "less-than" partition of lessThanPivot is again lessThanPivot (i.e. an equivalent list) and the "greater-than" partition of greaterThanPivot is again greaterThanPivot, leading to an infinite recursion.

Your second snippet "works" because lst[1:] keeps trimming off the first element of the list, so the list gets smaller with each recurrence, eventually leading to the base case. The final answer is incorrect, however.

In short, pick a new pivot before each partition.

arshajii
  • 127,459
  • 24
  • 238
  • 287
2

somelist[1:] is using slice notation to produce a list which contains all of the elements after the first one. See this question for an explanation of slice notation.

The reason the first sort fails is because the list greaterThanPivot will always also contain the pivot, since the pivot is the first element and the condition being used is x >= pivot. Thus the recursion on greaterThanPivot always contains the same pivot, and never gets any smaller after the first partition. In your example, after the first step, greaterThanPivot = [78, 3526, 3642, 234], and if you trace this through the algorithm, you will see that it will always contain this value, until you run out of stack space.

In the second algorithm, the slice notation is being used to remove the first element of the list, that is, it removes the pivot before partitioning. This prevents the infinite recursion on the list containing the pivot.

Community
  • 1
  • 1
zstewart
  • 2,093
  • 12
  • 24