46

Due to changes in dict implementation in Python 3.6 it is now ordered by default. Do sets preserve order as well now?

I could not find any information about it but as both of those data structures are very similar in the way they work under the hood I thought it might be the case.

I know there is no promise for dicts to be ordered in all cases but they are most of the time. As stated in Python docs:

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon

MSeifert
  • 145,886
  • 38
  • 333
  • 352
Quba
  • 4,776
  • 7
  • 34
  • 60
  • @byxor You should not depend on *random* order, sets are *arbitrarily* ordered but far from random due to the hashing – Chris_Rands Jun 27 '18 at 15:28
  • 1
    If you are interested in *why* sets are not insertion ordered, see [Why don't Python sets preserve insertion order?](https://stackoverflow.com/q/61414947/674039) – wim May 22 '20 at 16:38

2 Answers2

30

No, sets are still unordered.

You can verify this just by displaying a set that should have a "well-defined hash order"1 to make sure we don't accidentally get a set that looks ordered but actually isn't:

>>> a_set = {3,2,1}
>>> a_set
{1, 2, 3}
>>> list(a_set)
[1, 2, 3]

If it were ordered you would expect {3, 2, 1} and [3, 2, 1] as result of the examples.

While dicts are actually ordered (same example just a bit modified):

>>> a_dict = {3: 3, 2: 2, 1:1}
>>> a_dict
{3: 3, 2: 2, 1: 1}
>>> list(a_dict)
[3, 2, 1]

1 "well-defined hash order":

For integers that satisfy 0 <= integer < sys.hash_info.modulus the hash is just the number itself. That means if the set is ordered "based" on the hash (and not ordered based on the insertion "time") and the hash values don't collide (that's why I used small numbers and numbers that only differ by one) the order should be deterministic because they occupy slots inside the set that are next to each other:

  • Either from smallest to highest
  • or a from a specific value to the highest and then from the smallest to the specific value. This case happens if the next (in the sense of neighboring) free slot in the set is the first one.

As an example for the latter:

>>> a_set = {6,7,8,9}
>>> a_set
{8, 9, 6, 7}
MSeifert
  • 145,886
  • 38
  • 333
  • 352
  • Negative `int`s also hash to themselves (apart from `-1`), although I'm not sure what the exact lower boundary is – Chris_Rands Aug 09 '17 at 12:01
  • 1
    @Chris_Rands Yes, but because `-1` and `-2` both hash to `-2` there's a collision.:) – MSeifert Aug 09 '17 at 12:02
  • 1
    Yes and `-1` behaves this way because it's an error code in C; I believe the boundary is `(sys.maxsize // 4) - 1`, at least this is what Martijn Pieters told me previously – Chris_Rands Aug 09 '17 at 12:10
  • That's good to know. But it makes sense if one wants to catch errors during `hash`. I also found the maximum value. It's `sys.hash_info.modulus`. :) – MSeifert Aug 09 '17 at 12:11
12

sets are not ordered in Python 3.6, not even as a CPython implementation detail. A simple example illustrates this:

>>> import string
>>> string.digits
'0123456789'
>>> set(string.digits)
{'7', '0', '2', '8', '6', '9', '1', '5', '4', '3'}

The Python 3 docs are clear on this:

A set is an unordered collection with no duplicate elements.

Chris_Rands
  • 38,994
  • 14
  • 83
  • 119
  • But the docs on `dict` also say that "It is best to think of a dictionary as an **unordered** set of key: value pairs, with the requirement that the keys are unique (within one dictionary)." ([source](https://docs.python.org/3/tutorial/datastructures.html#dictionaries)). That's just the language spec, the implementation *could* be ordered... – MSeifert Aug 09 '17 at 11:51
  • @MSeifert "best to think of" is the key phrasing I think, there is no such caveat for the `set` docs – Chris_Rands Aug 09 '17 at 11:53
  • I thought that's just "fluff" (or included because PyPy had an ordered dictionary for a long time). But, like I said, that's just the language spec. It doesn't mean an implementation could implement it in an ordered fashion (i.e. with buckets or similar). – MSeifert Aug 09 '17 at 12:00
  • Comments are outdated, the word "unordered" is removed from that section of the docs now. – wim Nov 19 '19 at 15:39
  • The wording remains the same in the python 3.6 docs i think https://docs.python.org/3.6/tutorial/datastructures.html#dictionaries – Chris_Rands Nov 19 '19 at 16:19
  • Makes sense, because dict ordering was not officially ratified until 3.7 – wim Nov 19 '19 at 21:32