I think an example says it all about why I'm confused:

L=set()
L.add("A")
L.add("C")
L.add("B")

L
Out[6]: {'A', 'B', 'C'}

print(L)
{'B', 'C', 'A'}

Sorry for not respecting PEP8 in my example. Thanks!

EDIT: I'm using PyCharm and Python 3.10.2

  • sets are not ordered – tgpz Jan 26 '22 at 23:37
  • They're unordered, but nonetheless I'm unable to reproduce the behavior you've shown on CPython 3.9.0. What's your Python version and environment, please? The behavior you're seeing might be an implementation detail that you can't rely on. – ggorlen Jan 26 '22 at 23:37
  • Does this answer your question? [Does Python have an ordered set?](https://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) – tgpz Jan 26 '22 at 23:38
  • You're seeing the difference between IPython's custom display logic and the set's built-in `__str__` handling. – user2357112 Jan 26 '22 at 23:39
  • Does this answer your question? [Output difference between ipython and python](https://stackoverflow.com/questions/21110915/output-difference-between-ipython-and-python) – Ryan Haining Jan 26 '22 at 23:43
  • No, **they are not ordered**. Which means you shouldn't depend on *any particular order*, although the order of iteration is guaranteed stable as long as the size of a given set doesn't change. – juanpa.arrivillaga Jan 27 '22 at 00:27
  • @juanpa.arrivillaga Where is that guaranteed? – Kelly Bundy Jan 27 '22 at 01:16
  • @KellyBundy looking into it I was mistaken. I used to recall that dicts guaranteed this, but since dicts started to maintain insertion order, that point is moot. sets never explicitly guaranteed it. – juanpa.arrivillaga Jan 27 '22 at 03:21
  • I don't understand what IPython is. I've tried to read the linked post, but I still don't understand. – FluidMechanics Potential Flows Jan 27 '22 at 10:45
  • I've added my IDE and my Python version – FluidMechanics Potential Flows Jan 27 '22 at 10:47
  • @juanpa.arrivillaga Yes, for dicts I knew it, they guaranteed it for the correspondence between `keys()` and `values()`. But sets never had such reason for a guarantee. I was somewhat hoping you were right, but on the other hand, I don't remember ever having a use for it, and it could prevent optimizations, so I guess it's good that it's not guaranteed. – Kelly Bundy Jan 27 '22 at 12:12

3 Answers

Sets are not ordered in Python. In the plain CPython REPL, your example shows the same arbitrary order for both the echoed value and print:

Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:36:06) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> L=set()
>>> L.add("A")
>>> L.add("C")
>>> L.add("B")
>>> L
{'C', 'B', 'A'}
>>> print(L)
{'C', 'B', 'A'}

IPython, on the other hand, uses custom display logic that sorts the elements of a set before showing it in the Out[] display.

PyCharm automatically detects an IPython installation in the project's environment and uses it for the Python console. That is why the console echo of L is sorted while print(L) is not.
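If a predictable order is needed regardless of the console in use, the usual approach is to sort explicitly rather than rely on any display logic. A minimal sketch, reusing the set from the question (the name `ordered` is only illustrative):

```python
L = set()
L.add("A")
L.add("C")
L.add("B")

# print() uses the set's own repr, which follows the set's iteration order.
# For strings that order is arbitrary and may differ between runs.
print(L)

# sorted() returns a new list, so the result no longer depends on hash order.
ordered = sorted(L)
print(ordered)  # ['A', 'B', 'C']
```

Sorting only works when the elements are mutually comparable, which is the case for a set of strings like this one.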

Alexander

Python sets are not ordered.

From the docs:

Being an unordered collection, sets do not record element position or order of insertion.
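If insertion order does matter, one common workaround (not part of the quoted documentation) is to use the keys of a dict, since dicts preserve insertion order in Python 3.7 and later. A small sketch, with `items` and `ordered_unique` as illustrative names:

```python
items = ["A", "C", "B", "A"]

# dict.fromkeys() keeps the first occurrence of each key, in insertion order,
# so the keys behave like an insertion-ordered set.
ordered_unique = dict.fromkeys(items)

print(list(ordered_unique))  # ['A', 'C', 'B']
```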

Ryan Haining

Sets are implemented via hash tables, as opposed to a list-like data structure. The difference is that you can look up an item in constant time on average. To make that possible, the hash of the object's value determines where it's placed in the set (it's slightly more complicated: because of collisions, the exact placement also depends on the insertion order).

A similar thing is going on with dictionaries, which are also hash tables, although since Python 3.7 they additionally preserve insertion order.
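To make the idea concrete, here is a toy hash table with linear probing. It is only a sketch of the general technique, not CPython's actual set implementation; the function name `toy_iteration_order` and the fixed table size are assumptions made for illustration. Small non-negative integers are used because in CPython they hash to themselves, which keeps the example deterministic.

```python
def toy_iteration_order(items, slots=8):
    """Place items in a fixed-size table and return them in slot order.

    Assumes fewer distinct items than slots, so a free slot always exists.
    """
    table = [None] * slots
    for item in items:
        index = hash(item) % slots  # the hash picks the starting slot
        while table[index] is not None:
            if table[index] == item:  # already present, nothing to do
                break
            index = (index + 1) % slots  # collision: try the next slot
        table[index] = item
    # "Iterating" walks the slots from left to right.
    return [item for item in table if item is not None]


print(toy_iteration_order([3, 2, 1]))  # [1, 2, 3]
print(toy_iteration_order([9, 1]))     # [9, 1]
print(toy_iteration_order([1, 9]))     # [1, 9]
```

In the first call the order follows the hashes, not the insertion order. In the last two calls 1 and 9 compete for the same slot, so whichever is inserted first wins it, which is how insertion order can still leak into the result.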

While print(L) shows the elements in the set's own iteration order, which reflects their layout in the underlying hash table, the automatic Out[] display of IPython/Jupyter uses a custom printing method, which apparently sorts sets before printing. See the link in the comment by Ryan Haining for some more detail.

A somewhat different thing you might notice is that the order changes between runs of your program. This is because string hashes are salted with a random value chosen when the interpreter starts, which protects against denial-of-service attacks that feed a program inputs crafted to collide in its hash tables. This can be disabled if you want to always get the same ordering. On Linux/Mac, you can do this by running

export PYTHONHASHSEED=0

before running Python.
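The effect can also be checked from Python itself by launching child interpreters with the variable set. This is a rough sketch; the helper `run_with_seed` and the `SNIPPET` constant are made-up names, and the actual order printed depends on the Python build.

```python
import os
import subprocess
import sys

# Code executed in each child interpreter: build a set and print its order.
SNIPPET = 'print(list({"A", "C", "B"}))'


def run_with_seed(seed: str) -> str:
    """Run SNIPPET in a fresh interpreter with PYTHONHASHSEED set to seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


first = run_with_seed("0")
second = run_with_seed("0")

# With a fixed seed both runs should report the same order.
print(first, second, first == second)
```

Passing "random" instead of "0" restores the default behavior, where the order may differ from one run to the next.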

jmd_dk
  • Hash salting is a different thing with different objectives. Salts are used to protect password hashes. They're a per-password additional chunk of data, hashed along with the password and stored alongside the hash. They protect against attacks based on precomputing password hashes, and they force attackers to attack each password individually instead of attacking many at once. – user2357112 Jan 27 '22 at 00:03
  • While salting might typically be used with passwords, it is nonetheless the same technique used here. See the [docs](https://docs.python.org/3/reference/datamodel.html#object.__hash__), where they in fact use the same term. – jmd_dk Jan 27 '22 at 00:07
  • The docs are using the wrong term. It's a different technique with many differences in how it works and what it's for. For example, salts are per-password, while Python's hash seed is global to the process. Salts are persistent, producing a consistent password hash even if you restart the process, while Python's hash seed dies with the interpreter process. Salts protect against recovering the input of the hash function, while hash seeding protects against attacks where the attacker gets to choose the input. – user2357112 Jan 27 '22 at 00:20
  • It's an extremely simple technique. Just sprinkle in some secret bits before taking the hash (hence the name). It's totally fair of the docs to refer to what's going on here as salting, though the technique is typically associated with passwords. Sure the salt is not attached to a password, but to a Python process. Big deal. – jmd_dk Jan 27 '22 at 00:35