10

Lets say you have a set:

foo = {1, 2, 3, 4, 5}

In the book I am currently reading, Pro Python, it says that using foo.pop()will pop an arbitrary number from that selection. BUT...When I try it out, it pops 1, then 2, then 3...Does it do it arbitrarily, or is this just a coincidence?

Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485
Billjk
  • 10,387
  • 23
  • 54
  • 73
  • This has to do with the hash function that a set uses to map it's contents to locations in the memory. Try doing `hash()` on different data types and see what numbers you get. The arbritrary number popped out of a set _will_ be the _next_ element in the set. It just so happens that the "order" the set stores the data may not _necessarily_ be ordered as far as you are concerned. In this example, the order the elements are in happen to coincide with the order that the hashmap is stored/retrieved. – Joel Cornett Mar 24 '12 at 02:54
  • I found the same thing today and I want to up your question – jxie0755 Jan 02 '18 at 18:57

2 Answers2

16

The reason it says it is arbitrary is because there is no guarantee about the ordering it will pop out. Since you just created the set, it may be storing the elements in a "nice" order, and thus .pop() happens to return them in that order, but if you were to mutate the set, that might not continue to hold.

Example:

>>> foo = set()
>>> foo.add(-3)
>>> foo.add(-1)
>>> foo.add(2)
>>> foo.pop()
2
>>> foo.pop()
-3
Amber
  • 507,862
  • 82
  • 626
  • 550
15

Set and dictionaries are implemented using hash tables. They are unordered collections, meaning that they have no guaranteed order.

The order you're seeing is a non-guaranteed implementation detail. In CPython, the hash value for an integer is the integer itself:

>>> [hash(i) for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

That implementation detail causes the integers to appear ordered in your set. Other sets would be semi-ordered, {5, 6, 7, 8, 9} appears as set([8, 9, 5, 6, 7]).

In contrast, other datatypes such as str have different hash functions and will appear to be more scrambled. For example:

# Example of scrambling str objects in a 64-bit build
>>> {'red', 'green', 'blue'}
set(['blue', 'green', 'red'])

The set.pop method pops off entries left-to-right. That is also a non-guaranteed implementation detail.

The short answer to your question is Yes, the ordering is arbitrary but No, what you saw wasn't just a coincidence either, rather it was an interesting non-guaranteed implementation detail.

Hope this clears-up the mystery for you :-)

Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485