0

I found a weird result of this below code on my project (below code is equivalent to the code in my project since I have to remove parts that are irrelevant to the question):

import random
random.seed(9000)

...
list1 = [0, 1]
list2 = []
set_diff = set(list1) - set(list2)
print( set_diff )
list_diff = list( set_diff )
print( list_diff )
print( random.choice( list_diff ) )

The result is unstable since the order (as printed) of set_diff is unstable (set is supposed to have no order). Result could be:

{'0', '1'}
['0', '1']
1

or

{'1', '0'}
['1', '0']
0

in different runs. Could anyone please explain why? Thanks!

ayhan
  • 70,170
  • 20
  • 182
  • 203
LeonA
  • 13
  • 4

1 Answers1

2

A set is unordered and so will yield its elements in any order. However, this order is consistent within one python invocation. That is, set_ = set(range(N)); list(set_) == list(set_) is always true within in the same python program. Python 3.2+ explicitly makes sure that the ordering will inconsistent from one python instance to the next (this is a security consideration relating to denial of service attacks involving dictionary construction). This is the behaviour you are seeing.

To avoid this you need to set the environment variable PYTHONHASHSEED to the same value before you start your program. This is in addition to setting the random seed before using random.choice.

export PYTHONHASHSEED=1
python myscript.py

A simpler solution, however, is to create a sorted list before doing random.choice. ie.

random.choice(sorted({1, 2, 3}))
Dunes
  • 37,291
  • 7
  • 81
  • 97