117

I have a rather big program in which I use functions from the random module in several files. I would like to set the random seed once, in one place, so that the program always returns the same results. Can that even be achieved in Python?

DSM
Mischa Obrecht

9 Answers

157

The main Python module that is run should import random and call random.seed(n). The seeded state is shared with every other import of random, as long as nothing else resets the seed.
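
A minimal sketch of this pattern, kept in one file for brevity. The names `simulate_step` and `run` are hypothetical; `simulate_step` stands in for code that would normally live in a separate module with its own `import random`.

```python
import random


def simulate_step():
    # Stands in for a function in another module that draws from the
    # shared, module-level generator.
    return random.randint(1, 100)


def run(seed):
    # Seed once, at the program's entry point, before any numbers are drawn.
    random.seed(seed)
    return [simulate_step() for _ in range(5)]


if __name__ == "__main__":
    # Two runs with the same seed produce the same sequence.
    print(run(42) == run(42))
```

The seed has to be set before the first draw anywhere in the program, including module-level code that runs at import time.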

Jon Clements
  • 5
    Could I be resetting the seed somewhere without knowing it? Setting the seed once in the main file does not do the trick. – Mischa Obrecht Jul 17 '12 at 16:44
  • 2
    @MischaObrecht I guess so - the seed is only initialised on the **first** import of the random module - if it's imported more than once, it won't perform the initialisation and reset the seed - so there must be an explicit call somewhere in your code – Jon Clements Jul 17 '12 at 16:49
  • 5
    If you're calling methods from `random` in module level code, that you're importing in main, before you get to the `random.seed(n)` in main, then those calls will be made before the seed, and so will be time-seeded and effectively un-reproducibly random. – Russell Borogove Jul 17 '12 at 18:30
  • 17
    If it turns out that some third-party code is reseeding the RNG (unlikely but possible), note that you can create additional random number generators with independent state via the `random.Random()` constructor, and use those when strict reproducibility is important. – Russell Borogove Jul 17 '12 at 18:35
  • This does not work for me, and I have no reproducible code. I am guessing I will have to check the documentation of all imported libraries... (see http://stackoverflow.com/questions/37886997/having-problems-keeping-a-simulation-deterministic-with-random-random0-in-pyth/37888152?noredirect=1#comment63390942_37888152) – B Furtado Jun 23 '16 at 16:04
  • Is this guaranteed to set the random seed if one is using multiple libraries like numpy, scipy, tensorflow, etc.? – Charlie Parker Apr 05 '17 at 04:20
  • @Russell Borogove's argument is enough for me not to simply set the seed of `import random` (create an application object instead). Although unlikely, an update in a dependency could break an application if the update introduces a reset of the seed and the application depends on a stable seed. – toto_tico May 17 '17 at 08:05
  • Something that puzzled me for a long time was the fact that `set` introduces some randomness too. `random.choice(list(set(...)))` or even `list(set(...))[0]` might return a different value even if the seed is set. To be safe, we can use `sorted(set(...))` – mountrix Mar 14 '23 at 22:42
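
The `random.Random()` suggestion in the comments above can be sketched as follows. This is an illustration only: `draw_two` and the seed value 1234 are made up, and the `interfere` flag simulates some other code reseeding the module-level generator.

```python
import random


def draw_two(interfere):
    # A private generator with its own state, separate from the
    # module-level generator behind random.random().
    rng = random.Random(1234)
    first = rng.random()
    if interfere:
        # Simulate other code reseeding and using the global generator.
        random.seed()
        random.random()
    second = rng.random()
    return first, second


if __name__ == "__main__":
    # The private generator is unaffected by the global reseed.
    print(draw_two(interfere=True) == draw_two(interfere=False))
```

Passing such an instance explicitly to the code that needs reproducibility avoids depending on global state that other imports may touch.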
55

zss's comment should be highlighted as an actual answer:

Another thing for people to be careful of: if you're using numpy.random, then you need to use numpy.random.seed() to set the seed. Using random.seed() will not set the seed for random numbers generated from numpy.random. This confused me for a while. -zss
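
A minimal sketch of seeding both generators from one place. The helper name `seed_everything` is hypothetical, and the guarded import is only there so the sketch also runs where NumPy is not installed.

```python
import random

try:
    import numpy as np
except ImportError:  # NumPy is optional in this sketch
    np = None


def seed_everything(seed):
    # Python's own generator and NumPy's legacy global generator keep
    # separate state, so each needs its own seed call.
    random.seed(seed)
    if np is not None:
        np.random.seed(seed)


if __name__ == "__main__":
    seed_everything(2012)
    print(random.random())
    if np is not None:
        print(np.random.normal())
```

Other libraries with their own generators would need their own seed call added to the same helper.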

Ivo
Sida Zhou
  • Absolutely true. If somewhere in your application you use random numbers from the `random` module, let's say the function `random.choices()`, and then further down at some other point the `numpy` random number generator, let's say `np.random.normal()`, you have to set the seed for both modules. What I typically do is have a couple of lines in my `main.py` like `random.seed(my_seed)` and `np.random.seed(my_seed)`. Kudos to zss – Aenaon Jul 13 '19 at 11:07
  • 1
    Sage has a similar issue, as its PRNG is distinct from both Python's and numpy's. Use `set_random_seed()` for Sage. – Brent Baccala Aug 19 '19 at 16:01
10

At the beginning of your application, call random.seed(x), making sure x is always the same. This ensures that the sequence of pseudo-random numbers is the same on each run of the application.

Chimera
6

Jon Clements pretty much answers my question. However, it wasn't the real problem: it turns out that the source of my code's randomness was the numpy.linalg SVD, which does not always produce the same results for badly conditioned matrices!

So be sure to check for that in your code if you have the same problem!

Mischa Obrecht
  • 33
    Another thing for people to be careful of: if you're using numpy.random, then you need to use numpy.random.seed() to set the seed. Using random.seed() will not set the seed for random numbers generated from numpy.random. This confused me for a while. – zss Nov 19 '14 at 05:01
5

Building on previous answers: be aware that many constructs can cause execution paths to diverge, even when all seeds are controlled.

I was thinking "well I set my seeds so they're always the same, and I have no changing/external dependencies, therefore the execution path of my code should always be the same", but that's wrong.

The example that bit me was list(set(...)), where the resulting order may differ.
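
A small sketch of the usual workaround, sorting before choosing. The order of a set of strings can change between runs because string hashes are randomized per process unless the `PYTHONHASHSEED` environment variable is set. The function `pick_member` and the sample names are made up for illustration.

```python
import random


def pick_member(members, seed):
    # Sorting fixes the order, so the same seed always selects the same
    # element regardless of how the set happens to iterate.
    ordered = sorted(set(members))
    rng = random.Random(seed)
    return rng.choice(ordered)


if __name__ == "__main__":
    names = ["carol", "alice", "bob", "alice", "dave"]
    print(pick_member(names, seed=0))
```

This only works when the elements can be ordered; for other objects, a sort key or an explicitly ordered container would be needed.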

JBSnorro
1

One important caveat is that for Python versions earlier than 3.7, the iteration order of dictionary keys is not guaranteed. This can lead to randomness in the program, or even to random numbers being drawn in a different order, and therefore to non-reproducible results. Conclusion: update Python.

Davoud Taghawi-Nejad
-1

I was also puzzled by this question when reproducing a deep learning project, so I ran a toy experiment and am sharing the results with you.

I created two files in a project, named test1.py and test2.py. In test1, I set random.seed(10) for the random module and print 10 random numbers several times. As you can verify, the results are always the same.

What about test2? I do the same thing, except that I do not set the seed for the random module. The results differ every time. However, as long as I import test1, even without using it, the results are the same as in test1.

So the experiment leads to the conclusion that if you want to set the seed for all files in a project, you need to import the file/module that defines and sets the seed.
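
One way to see why this happens: Python caches imported modules in `sys.modules`, so every `import random` returns the same module object with the same generator state. Importing test1 runs its module-level `random.seed(10)`, which seeds that shared state. The sketch below mimics two files with two import names; `random_again` and `same_module` are made-up names.

```python
import random
import sys

# A second import, as another file in the project would do.
import random as random_again


def same_module():
    # Both names refer to the single cached module object.
    return random is random_again is sys.modules["random"]


if __name__ == "__main__":
    print(same_module())
    # Seeding through one name affects draws made through the other.
    random.seed(10)
    print(random_again.random())
```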

Gary
-1

According to Jon's answer, calling random.seed(n) at the beginning of the main program sets the seed globally. Afterwards, to seed the imported libraries, one can use the output of random.random(). For example:

rng = np.random.default_rng(int(abs(math.log(random.random()))))

tf.random.set_seed(int(abs(math.log(random.random()))))
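
Note that `int(abs(math.log(random.random())))` mostly yields very small integers (0 in well over half of all draws), so different libraries are likely to end up with the same seed. A possible alternative, sketched here with a hypothetical helper `derive_seeds`, is to draw 32-bit integers from a generator seeded with the master seed:

```python
import random


def derive_seeds(master_seed, count):
    # A private generator seeded once; each call to getrandbits(32)
    # returns an integer in [0, 2**32) to use as a library seed.
    master = random.Random(master_seed)
    return [master.getrandbits(32) for _ in range(count)]


if __name__ == "__main__":
    numpy_seed, tensorflow_seed = derive_seeds(2012, 2)
    print(numpy_seed, tensorflow_seed)
```

The resulting integers could then be passed to `np.random.default_rng(...)` and `tf.random.set_seed(...)` as in the lines above.
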
acciptris
-17

You can guarantee this pretty easily by using your own random number generator.

Just pick three largish primes (assuming this isn't a cryptography application), and plug them into a, b and c: a = ((a * b) % c). This gives a feedback system that produces pretty random data. Note that not all primes work equally well, but if you're just doing a simulation, it shouldn't matter. All you really need for most simulations is a jumble of numbers with a pattern (pseudo-random, remember) complex enough that it doesn't match up in some way with your application.

Knuth talks about this.
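
For illustration only, the recurrence a = ((a * b) % c) can be sketched as a small class. The class name `LehmerRNG` is made up, and instead of arbitrary primes this sketch assumes the well-known "minimal standard" parameters (multiplier 48271, modulus 2**31 - 1).

```python
class LehmerRNG:
    """Multiplicative congruential generator: state = (state * b) % c."""

    MULTIPLIER = 48271       # assumed parameter, not from the answer
    MODULUS = 2**31 - 1      # assumed parameter, not from the answer

    def __init__(self, seed):
        # The state must be nonzero, otherwise it would stay at zero forever.
        self.state = seed % self.MODULUS or 1

    def next_int(self):
        self.state = (self.state * self.MULTIPLIER) % self.MODULUS
        return self.state

    def random(self):
        # Scale the integer state to a float strictly between 0 and 1.
        return self.next_int() / self.MODULUS


if __name__ == "__main__":
    rng = LehmerRNG(2012)
    print([rng.random() for _ in range(3)])
```

A generator of this kind is reproducible by construction, but its statistical quality depends entirely on the chosen parameters.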

user1277476
  • 13
    Rolling your own is unnecessary, because Python has excellent random number facilities in its standard library, and it's very easy to create a really bad generator if you don't know what you're doing. – Russell Borogove Jul 17 '12 at 18:31
  • 7
    I agree that's a pretty bad solution: in Monte Carlo simulations (which is what my program is), where one usually collects millions of samples, correlated random numbers (stemming from a bad generator) can easily mess up your results! – Mischa Obrecht Jul 18 '12 at 11:07
  • You mean, Knuth is talking about this all the time? Even now? – means-to-meaning Oct 01 '14 at 11:16