For a scientific experiment I need to generate 10 random, fixed-size subsets of a list. For the experiment to be repeatable I want to initialize 10 different instances of random.Random(), each with a known seed.

The question "How different do random seeds need to be?" seems to suggest that using seeds 1 to 10 might be a bad idea, as the results could be linearly dependent.

If it is bad practice to pick seeds 1 to 10 for this case, what would be a good strategy for selecting seeds in a repeatable manner?

Clarification: it is important that the same seeds are always used when the program is run (with a specific dataset)! In the end my program must be deterministic.

Peter Smit
  • How about just using some random numbers, generated from an unknown seed, as your seed values? – Sven Oct 29 '13 at 13:06
  • I added a clarification to the question. For repeatability I always need the same seed values when the program is run for a specific data set. – Peter Smit Oct 29 '13 at 13:20
  • Yes, that's what I meant. Just write down your numbers. If you need true random data, try http://www.random.org; they also have sequence and distribution generators. – Sven Oct 29 '13 at 14:13

2 Answers

Using random.org, I generated 10 random numbers from 2**0 to 2**28, giving as seeds:

187372311
204110176
129995678
6155814
22612812
61168821
21228945
146764631
94412880
117623077

Using a linear sequence of seeds can be problematic as noted in the comments. The numbers from random.org:

[...] come from atmospheric noise, which for many purposes is better than the pseudo-random number algorithms typically used in computer programs.
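As a minimal sketch of how such a stored seed list could be used: one random.Random instance is created per seed, and each instance draws one fixed-size subset. The function name `fixed_size_subsets`, the placeholder `data` list, and the subset size of 5 are assumptions made for illustration; they are not part of the question.

```python
import random

# The ten seeds listed in this answer (generated once on random.org).
SEEDS = [
    187372311, 204110176, 129995678, 6155814, 22612812,
    61168821, 21228945, 146764631, 94412880, 117623077,
]

def fixed_size_subsets(items, size, seeds=SEEDS):
    """Return one random subset of `size` elements for each seed."""
    subsets = []
    for seed in seeds:
        rng = random.Random(seed)  # separate generator instance per seed
        subsets.append(rng.sample(items, size))
    return subsets

# Placeholder data set (assumption); replace with the real list.
data = list(range(100))
subsets = fixed_size_subsets(data, 5)
```

Running this twice with the same Python version and the same `data` should give the same ten subsets, because each generator starts from a fixed seed.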

Hooked
  • Although storing the seeds is a method that could work, it is not very practical in my case. I would like to know if there is, specifically for the Python random methods, a way to select seeds in a deterministic manner that produces independent random data sets. – Peter Smit Oct 30 '13 at 07:21
  • @PeterSmit It sounds like you want _reproducibility_ in your code. If that is the case, know that the same seed should give you the same sequence on different computers, since they all use the Mersenne Twister generator. If you don't want to store 10 different seeds, store _one_ seed: if you need, say, 500 random numbers per program and have 10 programs, generate all 5000 numbers at once from that seed. The results (in the sense of randomness of the sequence) should be the same. – Hooked Oct 30 '13 at 13:58
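The single-seed idea from the comment above could be sketched roughly as follows: one generator, seeded once, draws all subsets in sequence. `MASTER_SEED`, `subsets_from_one_seed`, and the placeholder `data` list are hypothetical names and values chosen for illustration only.

```python
import random

MASTER_SEED = 12345  # hypothetical value; any fixed integer would do

def subsets_from_one_seed(items, size, count, seed=MASTER_SEED):
    """Draw `count` subsets of `size` elements from one seeded generator."""
    rng = random.Random(seed)  # seeded once, then reused for every draw
    return [rng.sample(items, size) for _ in range(count)]

# Placeholder data set (assumption); replace with the real list.
data = list(range(100))
subsets = subsets_from_one_seed(data, 5, 10)
```

Only one number has to be stored. The trade-off is that the subsets depend on the order in which they are drawn, so subset 7 cannot be regenerated without first drawing subsets 1 to 6.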

In order to compete for reputation, I'll put my comments here: ;)

How about just using some random numbers, generated from an unknown seed, as your seed values? Write down your numbers, and keep them for reproducibility.

If you need true random data, try http://www.random.org; they also have sequence and distribution generators.

EDIT

It depends on what kind of random numbers you want. You want to seed a PRNG, so you probably (pun not intended) want uniformly distributed numbers from the complete range that your PRNG is able to accept as a seed. Then, just generate them, and write them down somewhere. If you want to read something about PRNGs, C++11 has some good ones in its library: http://en.cppreference.com/w/cpp/header/random.
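In Python, the "generate them once and write them down" step might look like the sketch below. It uses random.SystemRandom, which draws from the operating system's entropy source and cannot be seeded. The helper name `generate_seeds` and the 32-bit width are assumptions; Python's random.seed accepts integers of arbitrary size, so the width is an arbitrary choice here.

```python
import random

def generate_seeds(count, bits=32):
    """Draw `count` seeds uniformly from [0, 2**bits) using OS entropy."""
    system_rng = random.SystemRandom()  # not seedable, so not reproducible
    return [system_rng.getrandbits(bits) for _ in range(count)]

# Run this once, then copy the printed list into the experiment
# script as a constant so that every later run uses the same seeds.
seeds = generate_seeds(10)
print(seeds)
```

The output differs on every run, which is the point: the randomness is consumed once, and reproducibility comes from storing the result.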

HTH

Sven