5

I have this simple function that partitions a list and returns an index i in the list such that elements at indices less that i are smaller than list[i] and elements at indices greater than i are bigger.

def partition(arr):
    first_high = 0
    pivot = len(arr) - 1
    for i in range(len(arr)):
        if arr[i] < arr[pivot]:
            arr[first_high], arr[i] = arr[i], arr[first_high]
            first_high = first_high + 1

    arr[first_high], arr[pivot] = arr[pivot], arr[first_high]
    return first_high


if __name__ == "__main__":
    arr = [1, 5, 4, 6, 0, 3]
    pivot = partition(arr)
    print(pivot)

The runtime is substantially bigger with python 3.4 that python 2.7.6 on OS X:

time python3 partition.py
real 0m0.040s
user 0m0.027s
sys  0m0.010s

time python partition.py
real 0m0.031s
user 0m0.018s
sys  0m0.011s

Same thing on ubuntu 14.04 / virtual box

python3:

real 0m0.049s
user 0m0.034s
sys  0m0.015s

python:

real 0m0.044s
user 0m0.022s
sys  0m0.018s

Is python3 inherently slower that python2.7 or is there any specific optimizations to the code do make run as fast as on python2.7

Imaxd
  • 368
  • 2
  • 14
  • 9
    Use the `timeit` module to compare code execution, not `time`; Python startup time is overrepresented, as are random OS events such as disk flushes. – Martijn Pieters May 01 '14 at 15:32
  • I was trying your code when I realized, what is `lasthigh`? Out of the box that code will fail because `lasthigh` is not initialized? Please review. – Paulo Bu May 01 '14 at 15:44
  • @PauloBu you're right. there no lasthigh it is actually first_high. Fixed thanks. – Imaxd May 01 '14 at 15:57

1 Answers1

13

As mentioned in the comments, you should be benchmarking with timeit rather than with OS tools.

My guess is the range function is probably performing a little slower in Python 3. In Python 2 it simply returns a list, in Python 3 it returns a range which behave more or less like a generator. I did some benchmarking and this was the result, which may be a hint on what you're experiencing:

python -mtimeit "range(10)"
1000000 loops, best of 3: 0.474 usec per loop

python3 -mtimeit "range(10)"
1000000 loops, best of 3: 0.59 usec per loop

python -mtimeit "range(100)"
1000000 loops, best of 3: 1.1 usec per loop

python3 -mtimeit "range(100)"
1000000 loops, best of 3: 0.578 usec per loop

python -mtimeit "range(1000)"
100000 loops, best of 3: 11.6 usec per loop

python3 -mtimeit "range(1000)"
1000000 loops, best of 3: 0.66 usec per loop

As you can see, when input provided to range is small, it tends to be fast in Python 2. If the input grows, then Python 3's range behave better.

My suggestion: test the code for larger arrays, with a hundred or a thousand elements.

Actually, I went further and test a complete iteration through the elements. The results were totally in favor of Python 2:

python -mtimeit "for i in range(1000):pass"
10000 loops, best of 3: 31 usec per loop

python3 -mtimeit "for i in range(1000):pass"
10000 loops, best of 3: 45.3 usec per loop

python -mtimeit "for i in range(10000):pass"
1000 loops, best of 3: 330 usec per loop

python3 -mtimeit "for i in range(10000):pass"
1000 loops, best of 3: 480 usec per loop

My conclusion is that, is probably faster to iterate through a list than through a generator. Although the latter is definitely more efficient regarding memory consumption. This is a classic example of the trade off between speed and memory. Although the speed difference is not that big per se (less than miliseconds). So you should value this and what's better for your program.

Paulo Bu
  • 29,294
  • 6
  • 74
  • 73
  • I'm not at my computer so I can't check, but I think you should try your second benchmark with very large numbers (say, 10 million to a billion). I hypothesise that the upfront allocation that Python 2's `range` performs will kill the performance in the limiting case. – Benjamin Hodgson May 01 '14 at 18:00
  • Python 3 implementation seems to be optimized for very large ranges then! Thanks for the answer. – Imaxd May 01 '14 at 19:33
  • 5
    Python 3 `range()` should be compared with Python 2 `xrange()`. It does not generate the list first, it only iterates through the values. This way, it also does not consume more memory for bigger range. – pepr May 01 '14 at 21:10
  • Looping through the list may be optimized as the number of elements is known in advance. Actually, the loop need not to be stopped by the `StopIteration` exception -- this way the loop need not test whether exception happened (simpler code). But this is just my guess. – pepr May 01 '14 at 21:15
  • 2
    Using Python 2's range with a billion integer will explode the memory since it will need ~4GB to store the list. This is exactly when Python3's `range` or Python2's `xrange` come handy. The kind of tradeoff I mentioned in the answer. – Paulo Bu May 02 '14 at 15:17