
In a Python console, you have to set the seed before each call if you want the output to stay the same.

In[0]: import numpy as np

In[1]: np.random.seed(1); np.random.randint(1, 10, 5)
Out[1]: array([6, 9, 6, 1, 1])

In[2]: np.random.seed(1); np.random.randint(1, 10, 5)
Out[2]: array([6, 9, 6, 1, 1])

In[3]: np.random.randint(1, 10, 5)
Out[3]: array([2, 8, 7, 3, 5]) # Different output because the seed was not set again

However, when the code is split across multiple files, the random functions in one file are affected by a seed set in an imported module, which can cause unexpected behavior.

Say I have two files:

# main.py
from myfunc import myfunc
import numpy as np

myfunc()
print('main.py:', np.random.randint(1, 10, 5))

and

# myfunc.py
import numpy as np

def myfunc():
    np.random.seed(2019)
    numbers = np.random.randint(1, 10, 5)
    print('myfunc.py:', numbers)

If I run main.py twice, I get the same results both times:

myfunc.py: [9 3 6 9 7]
main.py: [9 1 1 8 9]

and

myfunc.py: [9 3 6 9 7]
main.py: [9 1 1 8 9]

This implies that randint in main.py was seeded even though no seed was set there. From this I guess that np.random.seed() works globally, so I should use it carefully, particularly when I only want it to apply locally.

My solution so far is to reset the seed whenever I finish using it, like this:

np.random.seed(2019)
numbers = np.random.randint(1, 10, 5)
np.random.seed()

I am not sure what the scope of np.random.seed() is. Is there any other way to avoid this global seeding issue?

Guoyang Qin
  • You may want to consider something like the context manager example given at: https://stackoverflow.com/questions/49555991/can-i-create-a-local-numpy-random-seed – Jon Clements Nov 23 '19 at 14:57
  • Does this answer your question? [Is there a scope for (numpy) random seeds?](https://stackoverflow.com/questions/50971213/is-there-a-scope-for-numpy-random-seeds) – BStadlbauer Nov 24 '19 at 10:52
  • @BStadlbauer Yes, it is very relevant, thanks. Especially this line "1) Yes. `moduleA` and `moduleB` uses the same seed. Importing `random` in `moduleA` creates the global `random.Random()` object. **Reimporting it in `moduleB` just gives you the same module** and maintains the originally created `random.Random()` object." It well clears up my confusion. – Guoyang Qin Nov 24 '19 at 16:00

2 Answers


When you call np.random.seed(), you seed the global numpy.random.RandomState. As a side note, the global (default) RandomState can be accessed like this:

import numpy

numpy_default_rng = numpy.random.random.__self__

To seed only locally, you can create your own RandomState instance and use its methods to draw numbers. (see also here)

E.g.:

import numpy

random_state = numpy.random.RandomState(seed=2)
random_state.randint(10)

will always return the same result, without seeding any other calls to np.random.
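
A minimal sketch of that isolation, assuming only that NumPy is installed; the seed values and variable names are illustrative and not from the original answer. It checks that drawing from a private RandomState neither reseeds nor advances the global generator.

```python
import numpy as np

# Seed the global generator and record what it produces next.
np.random.seed(0)
expected = np.random.randint(1, 10, 5)

# Reseed the global generator, then draw from a private RandomState in between.
np.random.seed(0)
local_state = np.random.RandomState(seed=2019)  # private instance, illustrative seed
local_draw = local_state.randint(1, 10, 5)
actual = np.random.randint(1, 10, 5)

# The global stream is the same as if the local instance had never been used.
print(np.array_equal(expected, actual))
```

The comparison should hold because the local instance keeps its own internal state, separate from the one behind the module-level np.random functions.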

BStadlbauer
  • I see. But I noticed that although the results are reproduced across runs because of the global seeding, they differ between the two files, see `myfunc.py: [9 3 6 9 7] main.py: [9 1 1 8 9]`. Do they share the same instance? – Guoyang Qin Nov 23 '19 at 15:12
  • They do share the same instance, which is seeded once `myfunc()` is called. The results differ because you are drawing two distinct sets of random numbers, one after another (equivalent to `np.random.randint(1, 10, 10)`). To get the same result, you would have to reseed the generator before calling it in `main.py` – BStadlbauer Nov 23 '19 at 15:21
  • Additionally, they are two separate files. I was confused about why seeding in one module would affect another module that imports it. This seems very flexible and somewhat dangerous. Shouldn't the variables and instances in the two files live in separate namespaces? – Guoyang Qin Nov 23 '19 at 15:22
  • So in Python imports happen only [once](https://stackoverflow.com/questions/2029523/how-to-prevent-a-module-from-being-imported-twice) and the default random number generator is presumably a [Singleton](https://en.wikipedia.org/wiki/Singleton_pattern), which is created only once on import of `numpy.random`. You could think of it as sort of a global variable, and I guess it was chosen this way to be able to seed all (possibly many) random number generators with one line (so they can easily be seeded for debugging but left normal at regular runtime) – BStadlbauer Nov 23 '19 at 15:29
  • @BStadlbauer Can I do e.g. `with n.r.RS(seed=2) as r_s: r_s.randint(10)`? – jtlz2 Aug 10 '21 at 09:14
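
On the `with` question above: as far as I know, RandomState does not implement the context-manager protocol, so that exact syntax would likely fail. A small wrapper in the spirit of the context-manager question linked in the comments can give a similar effect for the global generator. The sketch below is only an illustration under assumptions: `temp_seed` is a hypothetical helper, not a NumPy function, and it relies only on `np.random.get_state()` and `np.random.set_state()`.

```python
import contextlib
import numpy as np

@contextlib.contextmanager
def temp_seed(seed):
    """Hypothetical helper: seed the global generator, then restore its prior state."""
    saved_state = np.random.get_state()  # snapshot of the global RandomState
    np.random.seed(seed)
    try:
        yield
    finally:
        np.random.set_state(saved_state)  # put the global stream back where it was

# Reference run: two consecutive draws from the global generator.
np.random.seed(1)
ref_first = np.random.randint(1, 10, 5)
ref_second = np.random.randint(1, 10, 5)

# Same run, with a temporarily seeded block in the middle.
np.random.seed(1)
first = np.random.randint(1, 10, 5)
with temp_seed(2019):
    inside = np.random.randint(1, 10, 5)
second = np.random.randint(1, 10, 5)

# The draw after the block continues the outer stream as if nothing happened.
print(np.array_equal(ref_second, second))
```

Unlike calling `np.random.seed()` with no argument, which reseeds from fresh entropy, this restores the exact state the generator had before the block.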

You can create a local instance of numpy.random.RandomState to be absolutely sure that the seed is local:

>>> import numpy as np
>>> first_state = np.random.RandomState(seed=1)
>>> first_state.rand()
0.417022004702574
>>> first_state.rand()
0.7203244934421581
>>> second_state = np.random.RandomState(seed=1)
>>> second_state.rand()
0.417022004702574
>>> second_state.rand()
0.7203244934421581

Then you can call all the functions that draw numbers from different distributions on that local object, for example first_state.rand(), first_state.normal(), first_state.uniform() and so on.
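
Applied to the question's `myfunc`, this could look like the sketch below. It is one possible arrangement rather than the only one: the `seed` parameter and the returned array are additions for illustration, and both files are collapsed into a single script so it runs on its own.

```python
import numpy as np

def myfunc(seed=2019):
    # Private generator: seeding it does not touch np.random's global state.
    rng = np.random.RandomState(seed)
    numbers = rng.randint(1, 10, 5)
    print('myfunc.py:', numbers)
    return numbers

first = myfunc()
second = myfunc()  # same numbers again, since each call builds a fresh seeded instance

# This draw uses the global generator, which myfunc() no longer seeds,
# so it is expected to vary from run to run.
print('main.py:', np.random.randint(1, 10, 5))
```

With this layout the reproducibility of `myfunc` is a property of the function itself, and callers that want reproducible global draws have to seed them explicitly.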

BStadlbauer
ForceBru