
Sorry if this is a basic question, but I am trying to understand how the set type works in Python.

From docs:

A set object is an unordered collection of distinct hashable objects.

Being an unordered collection, sets do not record element position or order of insertion.

But if they are unordered, why am I always getting the same order in this test? I was expecting some random order.

>>> users_ids = set([1, 1, 2, 3])
>>> print users_ids
set([1, 2, 3])
user2990084

1 Answer


Random order is not the same as unordered. Unordered means there is no defined way the data will be ordered, i.e. neither the insertion order nor the values themselves are guaranteed to correlate with how the data is arranged.

The data always comes out in a predictable order because this particular implementation happens to arrange the elements deterministically: insert the same data in the same order and you always get the same arrangement. But there is no guarantee# that this will happen, and we do see it deviate in the Python 3.X dictionary implementation.
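
To make this concrete, here is a minimal sketch of a toy hash table, assuming CPython, where small non-negative integers hash to themselves. It is not CPython's actual set code (the real probing scheme and resizing are more involved), and `toy_set_order` is a hypothetical helper introduced only for illustration. It shows how an arrangement can be fully determined by the hashes and the insertion sequence without being sorted or in insertion order.

```python
# Toy model of an open-addressing hash table; NOT CPython's real implementation.
# toy_set_order is a hypothetical helper used only for this illustration.

_EMPTY = object()  # sentinel marking a free slot


def toy_set_order(values, table_size=8):
    """Return the values in the order a simplified hash table would iterate them."""
    slots = [_EMPTY] * table_size
    for value in values:
        start = hash(value) % table_size  # starting slot derived from the hash
        # Linear probing: try successive slots until we find the value or a free slot.
        for step in range(table_size):
            probe = (start + step) % table_size
            if slots[probe] is _EMPTY or slots[probe] == value:
                slots[probe] = value  # a duplicate simply overwrites itself
                break
        else:
            raise ValueError("toy table is full (this sketch never resizes)")
    # Iteration walks the slots from first to last, skipping free ones.
    return [v for v in slots if v is not _EMPTY]


print(toy_set_order([1, 1, 2, 3]))  # [1, 2, 3]
print(toy_set_order([3, 2, 1, 1]))  # [1, 2, 3]
print(toy_set_order([1, 2, 8]))     # [8, 1, 2]
print(toy_set_order([10, 1, 2]))    # [1, 10, 2]
print(toy_set_order([1, 2, 10]))    # [1, 2, 10]
```

In this model, sequential small integers land in slots 1, 2, 3 and so look sorted, while 8 wraps around to slot 0 and comes out first. The last two calls hold the same elements, but the collision between 2 and 10 is resolved differently depending on which was inserted first.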

Note: even if we see that the data is sorted,

>>> {1,2,3,4,5}
set([1, 2, 3, 4, 5])

we would still call it unordered unless the documentation explicitly says otherwise and guarantees the order; without that guarantee there may be surprises waiting for you. I have seen implementations that relied on sets and dictionaries maintaining an order based on the insertion pattern. Such implementations had serious consequences when they were ported to Python 3.X.
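
If code needs a defined order, one option is to impose it explicitly instead of relying on how the set happens to iterate. A minimal sketch, reusing the `users_ids` name from the question and written so that it runs on both Python 2.7 and Python 3:

```python
# Build the set from the question; duplicates are dropped.
users_ids = set([1, 1, 2, 3])

# sorted() returns a new list in ascending order, independent of
# whatever order the set itself iterates in.
ordered_ids = sorted(users_ids)

print(ordered_ids)  # [1, 2, 3]
```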

#

What’s New In Python 3.3

Security improvements:
    Hash randomization is switched on by default.
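
The effect of hash randomization can be observed by running the same snippet in fresh interpreters with different values of the `PYTHONHASHSEED` environment variable. The following is a rough sketch, assuming Python 3.3 or later; `set_order_with_seed` and the sample names are made up for illustration. Note that randomization salts the hashes of `str` and `bytes` objects, not integers, so the integer example in the question stays stable either way.

```python
import os
import subprocess
import sys

# Child program: print the iteration order of a small set of strings.
CHILD_CODE = "print(list(set(['alice', 'bob', 'carol', 'dave'])))"


def set_order_with_seed(seed):
    """Run CHILD_CODE in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    output = subprocess.check_output([sys.executable, "-c", CHILD_CODE], env=env)
    return output.decode().strip()


# The same seed always reproduces the same order; different seeds
# will usually (though not necessarily) produce different orders.
for seed in (1, 2, 3):
    print(seed, set_order_with_seed(seed))
```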
Abhijit
  • Will you share a link describing the Python 3.X dictionary implementation and how it differs from Python 2.X? – That1Guy Apr 28 '15 at 16:03
  • Let me see if I understand. In Python 2.7, sets are unordered but the particular implementation keeps the order of insertion?! In that case they are probably ordered, but that is not guaranteed in future versions. Am I correct in my understanding? – user2990084 Apr 28 '15 at 16:11
  • @That1Guy the difference in 3.3 onwards is due to randomness in hashing - see http://stackoverflow.com/a/14959001/3001761 – jonrsharpe Apr 28 '15 at 16:13
  • @user2990084: It is wrong to say that it keeps the order of insertion. What I meant was that every time you insert the same data in a particular order, the dictionary will arrange it in the same defined way. – Abhijit Apr 28 '15 at 16:13
  • @user2990084: I have updated the answer with the link. – Abhijit Apr 28 '15 at 16:14
  • @user2990084 The reason you're seeing the preserved order is because you're using sequential integers. The way Python 2.X hashes, your example just *happens* to be "in order". – That1Guy Apr 28 '15 at 16:15