19

I'd like to know if the absence of element ordering of the Python's built-in set structure is "random enough". For instance, taking the iterator of a set, can it be considered a shuffled view of its elements?

(If it matters, I'm running Python 2.6.5 on a Windows host.)

sblundy
  • 60,628
  • 22
  • 121
  • 123
Chuim
  • 1,983
  • 3
  • 17
  • 20

5 Answers5

36

No, it is not random. It is "arbitrarily ordered", which means that you cannot depend on it being either ordered or random.

Ignacio Vazquez-Abrams
  • 776,304
  • 153
  • 1,341
  • 1,358
  • 4
    It's important to understand the difference between "undefined" and "random". – Matti Virkkunen May 18 '10 at 19:22
  • 7
    Indeed, the order is predictable from the ID's of the various objects in the set. It's quite rigorously defined by the code. BUT -- bonus -- the details are none of your business, making them "arbitrary" and "implementation-specific" and "undependable for anything". And "undefined as far as you're allowed to care." – S.Lott May 18 '10 at 19:25
  • OK. The hash function will determine the order. For instance, for integer elements we'll get the natural order. So, I conclude that we'll have an "undefined", "arbitrary" and "repeatable" ordering for the same set of elements. – Chuim May 18 '10 at 20:01
  • 2
    It might only be repeatable under a single implementation of Python. If the spec says it's undefined, don't assume anything else about it (not even repeatability). – Matti Virkkunen May 18 '10 at 20:03
  • 1
    Undefined means "changeable without notice". So an upgrade from 2.6.1 to 2.6.2 i allowed to change things that are otherwise undefined. – S.Lott May 19 '10 at 11:03
  • The comments of this answer are as good as the answer it self. Too bad you can't share rep. – Bite code Nov 02 '11 at 14:20
6

In a word, no:

>>> list(set(range(10000))) == list(range(10000))
True
Daniel Stutzbach
  • 74,198
  • 17
  • 88
  • 77
6

Just a note about the rigorously of the order. It seems that it is very unreliable even in the same running environment.

For example this code gives different answers:

data = 'KSRNDOW3GQ'
chars = set(data)
print(list(chars))

enter image description here

xslittlegrass
  • 4,826
  • 4
  • 26
  • 32
4

No, you can not rely on it for any real statistical purpose. The implementation of sets in Python is in terms of a hash table, and can cause the element distribution to display some very non-random properties. There's a large gap between "not having a guaranteed order" and "guaranteed to be unordered in a uniform-random manner".

Use random.shuffle to really shuffle elements of a sequence.

Eli Bendersky
  • 263,248
  • 89
  • 350
  • 412
  • The thing is `random.shuffle` may only be used for sequences, which a `set` is not. One may convert it to a `list` but for a big number of elements and performance sensitive code it may be an issue... – Chuim May 18 '10 at 20:04
4

Arbitrariness is central when designing programs, each of these freedoms that you reserve is like a joker card that you can use when you implement, develop, or rewrite your program. The more of these free-cards you collect, the more efficiency can you deliver from your code (probably), since you have more freedom to change it.

It is not random, it's only freedom. If it's a better set that way, the order can be forwards on Wednesdays and "backwards" on Fridays.

u0b34a0f6ae
  • 48,117
  • 14
  • 92
  • 101