43

I know that __builtin__ sorted() function works on any iterable. But can someone explain this huge (10x) performance difference between anylist.sort() vs sorted(anylist) ? Also, please point out if I am doing anything wrong with way this is measured.

"""
Example Output:
$ python list_sort_timeit.py 
Using sort method: 20.0662879944
Using sorted builin method: 259.009809017
"""

import random
import timeit

print 'Using sort method:',
x = min(timeit.Timer("test_list1.sort()","import random;test_list1=random.sample(xrange(1000),1000)").repeat())
print x

print 'Using sorted builin method:',
x =  min(timeit.Timer("sorted(test_list2)","import random;test_list2=random.sample(xrange(1000),1000)").repeat())
print x


As the title says, I was interested in comparing list.sort() vs sorted(list). The above snippet showed something interesting that, python's sort function behaves very well for already sorted data. As pointed out by Anurag, in the first case, the sort method is working on already sorted data and while in second sorted it is working on fresh piece to do work again and again.

So I wrote this one to test and yes, they are very close.

"""
Example Output:
$ python list_sort_timeit.py 
Using sort method: 19.0166599751
Using sorted builin method: 23.203567028
"""

import random
import timeit

print 'Using sort method:',
x = min(timeit.Timer("test_list1.sort()","import random;test_list1=random.sample(xrange(1000),1000);test_list1.sort()").repeat())
print x

print 'Using sorted builin method:',
x =  min(timeit.Timer("sorted(test_list2)","import random;test_list2=random.sample(xrange(1000),1000);test_list2.sort()").repeat())
print x

Oh, I see Alex Martelli with a response, as I was typing this one.. ( I shall leave the edit, as it might be useful).

Senthil Kumaran
  • 54,681
  • 14
  • 94
  • 131

3 Answers3

61

Your error in measurement is as follows: after your first call of test_list1.sort(), that list object IS sorted -- and Python's sort, aka timsort, is wickedly fast on already sorted lists!!! That's the most frequent error in using timeit -- inadvertently getting side effects and not accounting for them.

Here's a good set of measurements, using timeit from the command line as it's best used:

$ python -mtimeit -s'import random; x=range(1000); random.shuffle(x)' '
y=list(x); y.sort()'
1000 loops, best of 3: 452 usec per loop
$ python -mtimeit -s'import random; x=range(1000); random.shuffle(x)' '
x.sort()'
10000 loops, best of 3: 37.4 usec per loop
$ python -mtimeit -s'import random; x=range(1000); random.shuffle(x)' '
sorted(x)'
1000 loops, best of 3: 462 usec per loop

As you see, y.sort() and sorted(x) are neck and neck, but x.sort() thanks to the side effects gains over an order of magnitude's advantage -- just because of your measurement error, though: this tells you nothing about sort vs sorted per se! -)

Andrea Bergonzo
  • 3,983
  • 4
  • 19
  • 31
Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • 1
    Thanks Alex! Would you explain why in-place sorting takes a slightly lesser time than say a list-returned by sorted()? Daniel mentions about copying, but we have not copied anything yet, right? I assume that the sorted-list generated by sorted() is a in-memory operation which when returns, assigns a new list object. – Senthil Kumaran Sep 17 '09 at 11:32
  • 2
    @Senthil: The objects in the list shouldn't be copied, but the list itself is (e.g. after `y = sorted(x)` you have two copies of the list, `x`, and `y`). So at a minimum the `sorted()` call requires a `malloc(n * sizeof(PyObject*))`, and `n` pointers to be copied from the first list to the new list. My guess is that copying those `n` pointers makes up for the about 2% time difference that Alex's benchmarks show. – Daniel Pryden Sep 17 '09 at 14:21
  • @Daniel, yep, malloc itself plus the pointers'' copy. Just timeit a simple list shallow copy operation (I suggest doing so on the command line!-) to see how much time that accounts for. – Alex Martelli Sep 17 '09 at 15:04
  • I also compared list.sort() vs sorted() on an iterator, results appear to be equivalent: `python -mtimeit -s 'import random; [random.random() for x in xrange(1000)].sort()': 100000000 loops, best of 3: 0.00899 usec per loop` and `python -mtimeit -s 'import random; sorted(random.random() for x in xrange(1000))': 100000000 loops, best of 3: 0.00891 usec per loop` – dtheodor Oct 17 '14 at 11:24
  • @SenthilKumaran: There is no real difference in performance; the values are being `shuffle`-ed differently on each run (`random` is reseeded on each Python launch), so the data is ordered differently on each run (Python's TimSort exploits existing ordering, so different random inputs can make a meaningful difference). If you make it consistent (call `random.seed(1)` before `random.shuffle` in the setup so the `x` is the same on each run), the differences drop to ~1 usec, which appears to be due to minor implementation details in CPython C-level argument parsing and byte code dispatch. – ShadowRanger Oct 13 '16 at 19:09
  • @SenthilKumaran: Implementation detail, but the CPython 2.7 implementation of `sorted` is basically equivalent to `def sorted(iterable, *args, **kwargs):`, `retval = list(iterable); retval.sort(*args[1:4], **kwargs); return retval`, except it performs argument parsing twice (ignoring the results on success) so the error message doesn't confuse people by referencing `list.sort` when they called `sorted`; double parsing, use of a C level NUL terminated string to find the `list`'s `sort` method (instead of interned `str` at Python level), and `tuple` slicing account for the (trivial) difference. – ShadowRanger Oct 13 '16 at 19:16
  • @dtheodor: Your timings are timing the wrong thing; you did the creation and sorting in the setup code, so the actual timed loops were just testing how fast `pass` runs. That's why your "sorting" is taking 9 nanoseconds per loop; that's the cost to do nothing (affirmatively) in Python. – ShadowRanger Oct 13 '16 at 19:18
  • oops. fixed: list().sort(): `$ python -mtimeit -s 'import random' -s 'i = (random.random() for x in xrange(1000))' 'list(i).sort()' 1000000 loops, best of 3: 0.549 usec per loop` sorted(): `$ python -mtimeit -s 'import random' -s 'i = (random.random() for x in xrange(1000))' 'sorted(i)' 1000000 loops, best of 3: 0.603 usec per loop` – dtheodor Oct 27 '16 at 14:01
14

Because list.sort does in place sorting, so first time it sorts but next time you are sorting the sorted list.

e.g. try this and you will get same results in timeit case most of the time is spent is copying and sorted also does one more copy

import time
import random
test_list1=random.sample(xrange(1000),1000)
test_list2=random.sample(xrange(1000),1000)

s=time.time()
for i in range(100):
    test_list1.sort()
print time.time()-s

s=time.time()
for i in range(100):
    test_list2=sorted(test_list2)
print time.time()-s
Anurag Uniyal
  • 85,954
  • 40
  • 175
  • 219
  • 1
    Yes, you pointed it out properly. I did know about that inplace sorting of the [].sort(), but somehow kept looking if the algorithms implemnted by both had any difference (which was cropping up). It seems that the [].sort() works pretty good in an already sorted list. BTW, for many reasons[1], I would use timeit for showing example snippets than time.time. The results are same while using timeit for single execution. http://diveintopython.org/performance_tuning/timeit.html – Senthil Kumaran Sep 17 '09 at 06:38
12

Well, the .sort() method of lists sorts the list in place, while sorted() creates a new list. So if you have a large list, part of your performance difference will be due to copying.

Still, an order of magnitude difference seems larger than I'd expect. Perhaps list.sort() has some special-cased optimization that sorted() can't make use of. For example, since the list class already has an internal Py_Object*[] array of the right size, perhaps it can perform swaps more efficiently.

Edit: Alex and Anurag are right, the order of magnitude difference is due to you accidentally sorting an already-sorted list in your test case. However, as Alex's benchmarks show, list.sort() is about 2% faster than sorted(), which would make sense due to the copying overhead.

Daniel Pryden
  • 59,486
  • 16
  • 97
  • 135
  • That was a helpful pointer. I just started digging and trying to understand the reasons for difference in in-place sorting vs returning operation. I have commented on Alex's response. Thanks. – Senthil Kumaran Sep 17 '09 at 11:33