Given that a set is an unordered data structure, I began to wonder whether it (or a dictionary) could be used to create a truly "random" number.

Let's consider an input number represented as a string:

input = "0123456789"

And then we convert the string into a set:

input_set = set(input)

After printing the result of this operation multiple times, we get outputs such as the following:

{'9', '3', '4', '6', '0', '7', '1', '8', '5', '2'}

{'3', '4', '2', '1', '5', '7', '0', '8', '6', '9'}

Now we can convert the elements of the set into a string with:

output = ''.join(input_set)

And the result of this operation for the sets above would be:

9346071852

3421570869

Would generating a random number this way be considered good practice?

And most of all, do I understand correctly that it would be a "pseudorandom" number, because we could reproduce the result by using some seed value or key?

Epion
  • 9
    `set` is not random at *all*; it's *arbitrary*. The language makes no guarantees about any property of the output. An implementation would be free to sort the elements if it so chose. – chepner May 09 '18 at 16:31
  • @chepner so why does it generate a different output on every run? – GalAbra May 09 '18 at 16:34
  • 1
    Just because it is different doesn't mean it is random. Plus, it's different in the interpreter *you* are using; that doesn't mean *every* interpreter has to do so. – chepner May 09 '18 at 16:36
  • I just tried this 10000 times in a loop. I got exactly **1** different outputs. So much for randomness... – Thierry Lathuille May 09 '18 at 16:36
  • 1
    @GalAbra: [Why is the order in dictionaries and sets arbitrary?](//stackoverflow.com/q/15479928) – Martijn Pieters May 09 '18 at 16:36
  • 1
    Possible duplicate of [Ensuring random order for iteration over Set Python](https://stackoverflow.com/questions/31598562/ensuring-random-order-for-iteration-over-set-python) – GalAbra May 09 '18 at 16:37
  • Also, see https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED. – chepner May 09 '18 at 16:39

1 Answer

3

No, this is not good practice.

The order is determined by the random hash seed, and this seed is fixed for the lifetime of the current Python interpreter process, so you would not be able to produce more than one 'random' order per interpreter. The hash seed is also an implementation detail, there to prevent a class of denial-of-service attacks. Different Python implementations (including future releases produced by Python.org) are free to come up with a different implementation of sets that doesn't use a hash seed. See the -R switch documentation for more details.
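As a rough illustration of the point about the hash seed, the sketch below rebuilds the set many times inside one process and then starts fresh interpreters with an explicit PYTHONHASHSEED. The helper name `digits_order` and the seed value 1 are my own choices for the example, and the behaviour shown is what I would expect from CPython, not something the language guarantees.

```python
import os
import subprocess
import sys

# One-liner executed in a child interpreter: print the set's iteration order.
SNIPPET = "print(''.join(set('0123456789')))"


def digits_order(hash_seed):
    """Run SNIPPET in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Within one process the hash seed never changes, so rebuilding the set
# keeps yielding the same order.
orders_in_process = {''.join(set("0123456789")) for _ in range(1000)}
print(len(orders_in_process))

# Two new processes started with the same seed agree with each other.
same_seed = (digits_order(1), digits_order(1))
print(same_seed)
```

In other words, whatever variation you see between runs comes from a single value chosen at interpreter start-up, and anyone who knows or sets that value can reproduce the "random" order.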

The seed is also aimed at producing good hashing performance, not cryptographic security. The ordering of values in a set also depends on the insertion order of the elements: the order is determined by the combination of each value's hash and any collisions that occur when that hash is mapped onto the limited number of slots available in the hash table. If you were to repeat your experiments, you'd almost certainly see a bias towards certain numbers appearing in certain positions.
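To make the insertion-order point concrete, here is a small sketch using integers instead of digit strings, because integer hashes are not affected by the hash seed. It assumes CPython internals (small integers hash to themselves and a new set starts with 8 slots, so 0 and 8 compete for the same slot); other implementations may behave differently, so the printed orders are not asserted.

```python
# Two sets with identical contents, built in a different insertion order.
a = set([0, 8])
b = set([8, 0])

# As sets they are equal, whatever order they iterate in.
print(a == b)

# On CPython the two values collide, so whichever was inserted first tends to
# come out first; this is an implementation detail, not a guarantee.
print(list(a), list(b))
```

The order here is fully explained by hashing and collision handling, which is the opposite of what you want from a random source.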

Stick to the secrets module to produce cryptographically secure random numbers.
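A minimal sketch of what that could look like for the digits in the question. The function names are hypothetical; `secrets.choice` and `secrets.SystemRandom` are the standard-library pieces being used.

```python
import secrets
import string


def secure_digit_string(length=10):
    """Return `length` decimal digits, each drawn independently."""
    return ''.join(secrets.choice(string.digits) for _ in range(length))


def secure_digit_permutation(digits="0123456789"):
    """Return the given digits in a random order, each used exactly once."""
    pool = list(digits)
    # SystemRandom draws from the operating system's entropy source.
    secrets.SystemRandom().shuffle(pool)
    return ''.join(pool)


print(secure_digit_string())
print(secure_digit_permutation())
```

The second function is the closest match to the set experiment, since it also produces a permutation of the ten digits.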

For non-secure operations, just use the random module; using random.shuffle() on your digits would already give a far better distribution of the numbers, statistically speaking.
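For example, something along these lines (the function name and the seed value 42 are just for illustration). Passing a seed also addresses the "pseudorandom" part of the question: with the same seed you get the same shuffle back.

```python
import random


def shuffled_digits(digits="0123456789", seed=None):
    """Return the digits in pseudorandom order; a fixed seed is reproducible."""
    rng = random.Random(seed)  # seed=None lets Python pick its own seed
    pool = list(digits)
    rng.shuffle(pool)
    return ''.join(pool)


print(shuffled_digits())          # varies from call to call
print(shuffled_digits(seed=42))   # repeatable for a given seed
print(shuffled_digits(seed=42))
```

Using a dedicated `random.Random` instance keeps the seeding local, so it does not disturb other code that relies on the module-level generator.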

Martijn Pieters