0

I've found some code that uses sets in Python. I tried to emulate them with lists, but I get different results when pop()ing from them!

I opened up Ipython to test how do these things work, and found something pretty strange:

In [16]: x
Out[16]: set([])

In [17]: x.add("a")

In [18]: x.add("b")

In [19]: x.add("c")

In [20]: x
Out[20]: set(['a', 'c', 'b'])

Shouldn't 'b' come before c, because it was added before it? I don't understand this.

jamylak
  • 128,818
  • 30
  • 231
  • 230
  • possible duplicate of [python set changes element order?](http://stackoverflow.com/questions/9792664/python-set-changes-element-order) – Wooble Aug 02 '12 at 12:15

3 Answers3

14

http://docs.python.org/library/stdtypes.html#set

Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.

The underlying data structure of a set is a hash map, there is lots of information on those here.

jamylak
  • 128,818
  • 30
  • 231
  • 230
  • 1
    But how do they decide where to put the element? Surely they're implemented like arrays, it would be strange if they chose a random number for the index. – corazza Aug 02 '12 at 11:49
  • 1
    @Bane Just posted a link to that, they use `hash` maps which calculate a hash for each object. `set`s are similar to dictionaries which you may know about. – jamylak Aug 02 '12 at 11:49
7

If you look at wikipedia's set entry, they say

An abstract data structure is a collection, or aggregate, of data. The data may be booleans, numbers, characters, or other data structures. If one considers the structure yielded by packaging[1] or indexing,[2] there are four basic data structures:[3][4]

unpackaged, unindexed: bunch
packaged, unindexed: set
unpackaged, indexed: string (sequence)
packaged, indexed: list (array)

So sets are unindexed, or not ordered in a specific manner.

The python documentation agrees with this (always check the documentation, Python has some of the best I've seen):

5.7. Set Types

A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. (For other containers see the built in dict, list, and tuple classes, and the collections module.)

Community
  • 1
  • 1
crashmstr
  • 28,043
  • 9
  • 61
  • 79
1

In addition to the good answers jamylak and crashmstr gave you, you can see it by yourself with an example.

>>> stringA="A"
>>> stringB="B"
>>> hash(stringA)
-269909568
>>> hash(stringB)
-141909181
>>> mySet = set()
>>> mySet.add(stringB)
>>> mySet.add(stringA)
>>> mySet
set(['A', 'B'])

So I inserted "B" in the set before than "A". Why does it show "A", "B" (when a list would keep the order?). Well, take a look to the hashes calculated for the strings "A" and "B". Which one is smaller? What would happen if you do the same thing with a dictionary where the keys are those hashes?:

>>> myDict = {-141909181: "B", -269909568: "A"}
>>> myDict
{-269909568: 'A', -141909181: 'B'}

Maybe this helps a bit understanding the sets.

Community
  • 1
  • 1
Savir
  • 17,568
  • 15
  • 82
  • 136