132

The documentation doesn't guarantee that. Is there any other place that it is documented?

I'm guessing it might be stable since the sort method on lists is guaranteed to be stable (Notes 9th point: "Starting with Python 2.3, the sort() method is guaranteed to be stable"), and sorted is functionally similar. However, I'm not able to find any definitive source that says so.

Purpose: I need to sort based on a primary key and also a secondary key in cases where the primary key is equal in both records. If sorted() is guaranteed to be stable, I can sort on the secondary key, then sort on the primary key and get the result I need.

PS: To avoid any confusion, I'm using stable in the sense of "a sort is stable if it guarantees not to change the relative order of elements that compare equal".

Sundar R
  • 13,776
  • 6
  • 49
  • 76

5 Answers5

166

Yes, the intention of the manual is indeed to guarantee that sorted is stable and indeed that it uses exactly the same algorithm as the sort method. I do realize that the docs aren't 100% clear about this identity; doc patches are always happily accepted!

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • 3
    I've found that if I'm sorting tuples or lists, whenever the "primary" sorting keys are equal, it ends up sorting by the "secondary" key. For example, `sorted([(1, 2), (1, 1)])` returns `[(1, 1), (1, 2)]` instead of returning the original input in the same sequence/order. Shouldn't the guarantee of stability mean that it should be returning the original `[(1, 2), (1, 1)]` input? In that case, you have be explicit and say `sorted([(1, 2), (1, 1)], key=lambda t: t[0])` – code_dredd Sep 01 '17 at 00:38
  • 20
    Isn't this what is expected in this case? Python is going to compare tuples through all elements by default, not just the first "primary" one. If you only want to sort on the first element, you can pass the `key` parameter explicitly. – Matias Grioni Nov 30 '17 at 03:52
  • 2
    @code_dredd this is the expected behavior. The point of stable-sort is sort using a "sorting key" but two different elements that have the same sorting key will be in the same order. The default sorting key for a tuple is all the elements of the tuple. – Guy Nov 05 '18 at 14:22
43

They are stable.

By the way: you sometimes can ignore knowing whether sort and sorted are stable, by combining a multi-pass sort in a single-pass one.

For example, if you want to sort objects based on their last_name, first_name attributes, you can do it in one pass:

sorted_list= sorted(
    your_sequence_of_items,
    key= lambda item: (item.last_name, item.first_name))

taking advantage of tuple comparison.

This answer, as-is, covers the original question. For further sorting-related questions, there is the Python Sorting How-To.

tzot
  • 92,761
  • 29
  • 141
  • 204
  • 8
    This can have undesired effect if you want to reverse the sort. For example when sorting on products, you may want to first sort on rating (ascending order) and then on price (also ascending). If you reverse this, you want to sort on rating in descending order but on price in ascending order. This doesn't work with this solution. – Remco Wendt Mar 08 '12 at 13:54
  • 4
    @RemcoWendt: there was no requirement for what you describe. In any case, consider `key= lambda item: (-item.rating, item.price)` or supplying a `cmp` instead of a `key` argument. I'm still not sure about the purpose of your comment, though. – tzot Mar 08 '12 at 23:07
  • 3
    Indeed it was not a requirement, but wanted to point out this subtle difference when other people read this and make a choice between your solution or using Python's stable sort feature. – Remco Wendt Mar 12 '12 at 21:12
  • I see. In other words, sorting by pairs is clearer and is thus preferable, unless you care about performance. I'd imagine that two stable sorts are somewhat faster than one sort by pairs, though the difference may be negligible - ? – Sergey Orshanskiy Dec 03 '13 at 17:39
  • 10
    @tzot I want to mention, there are always such requirements for stable sorting. For example, I have a list of tuple (rate, comment), the comments are saved in the order of when they are made, and I want to sort by the rate and keep the time order, however, I didn't save the timestamp in the list. To say it briefly, I just want to sort the list by rate, and to keep the comment in the same order. – wsysuper Apr 07 '15 at 12:21
  • The 3.6 docs on sort explicitly tells you that the stability is a wonderful property and gives an example of a complex sort. So I beg to disagree with this answer on _not needing to know_. Encoding information into index order as @wsysuper mentions is also common and needs stability. – Wolfgang Kuehn Jun 11 '18 at 09:49
7

The documentation changed in the meantime (relevant commit) and the current documentation of sorted explicitly guarantees it:

The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

This part of the documentation was added to Python 2.7 and Python 3.4(+) so any compliant implementation of that language version should have a stable sorted.

Note that for CPython the list.sort has been stable since Python 2.3

  • Tim Peters rewrote his list.sort() implementation - this one is a "stable sort" (equal inputs appear in the same order in the output) and faster than before.

I'm not 100% sure on sorted, nowadays it simple uses list.sort, but I haven't checked the history for that. But it's likely that it "always" used list.sort.

MSeifert
  • 145,886
  • 38
  • 333
  • 352
4

The Python 3.6 doc on sorting now states that

Sorts are guaranteed to be stable

Furthermore, in that document, there is a link to the stable Timsort, which states that

Timsort has been Python's standard sorting algorithm since version 2.3

Don Hatch
  • 5,041
  • 3
  • 31
  • 48
Wolfgang Kuehn
  • 12,206
  • 2
  • 33
  • 46
0

The "What's New" docs for Python 2.4 effectively make the point that sorted() first creates a list, then calls sort() on it, providing you with the guarantee you need though not in the "official" docs. You could also just check the source, if you're really concerned.

Peter Hansen
  • 21,046
  • 5
  • 50
  • 72
  • 1
    Could you point to where it says so? It says sorted() "works like the in-place list.sort()" and "a newly formed copy is sorted", but I don't see it saying that it internally uses sort(). – Sundar R Dec 16 '09 at 16:06
  • The "copy" that is formed is a list (that's what you get as a return value), and .sort() is called on that list prior to returning. QED. No, it's not an unassailable proof, but until Python has an official standard you won't get that. – Peter Hansen Dec 16 '09 at 17:01