489

What does np.random.seed do?

np.random.seed(0)
Mateen Ulhaq
covariance
  • 2
    I found this article very helpful in understanding `np.random.seed()` and pseudo-random numbers: https://www.sharpsightlabs.com/blog/numpy-random-seed/ – fpersyn Aug 23 '20 at 11:40
  • 2
    is this still the recommended answer? https://towardsdatascience.com/stop-using-numpy-random-seed-581a9972805f – Charlie Parker Nov 11 '21 at 21:29

12 Answers

844

np.random.seed(0) makes the random numbers predictable

>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55,  0.72,  0.6 ,  0.54])
>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55,  0.72,  0.6 ,  0.54])

With the seed reset (every time), the same set of numbers will appear every time.

If the random seed is not reset, different numbers appear with every invocation:

>>> numpy.random.rand(4)
array([ 0.42,  0.65,  0.44,  0.89])
>>> numpy.random.rand(4)
array([ 0.96,  0.38,  0.79,  0.53])

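Pseudo-random number generators start from a number (the seed) and repeatedly transform it. In the simplest kind, a linear congruential generator, the seed is multiplied by a large number, an offset is added, and the sum is taken modulo another number. The resulting number is then used as the seed to generate the next "random" number. NumPy's legacy generator uses a more elaborate algorithm (the Mersenne Twister), but the principle is the same: when you set the seed, it does the same thing every time, giving you the same numbers.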
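The multiply, add, modulo recipe can be sketched in a few lines. This is only a toy illustration, not what NumPy does internally; the function name `lcg` and the constants `a`, `c`, and `m` are assumptions picked for the example.

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` pseudo-random floats in [0, 1) from a toy LCG.

    The constants a, c and m are illustrative assumptions, not NumPy's.
    """
    state = seed
    values = []
    for _ in range(count):
        # Multiply, add an offset, take the modulo; the result is the next state.
        state = (a * state + c) % m
        values.append(state / m)
    return values


first = lcg(seed=0, count=3)
second = lcg(seed=0, count=3)
other = lcg(seed=1, count=3)

print(first == second)  # True: same seed, same sequence
print(first == other)   # False: a different seed starts elsewhere
```

Because of the offset `c`, a seed of 0 does not get stuck at 0, which is the point raised in the comments.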

If you want seemingly random numbers, do not set the seed. If you have code that uses random numbers that you want to debug, however, it can be very helpful to set the seed before each run so that the code does the same thing every time you run it.

To get different, hard-to-predict numbers on each run, call numpy.random.seed() with no argument (equivalently, numpy.random.seed(None)). NumPy then seeds itself with a random number obtained from /dev/urandom or its Windows analog or, if neither of those is available, from the clock.
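A minimal sketch contrasting the two modes, assuming the legacy global `np.random` functions used throughout this answer; the variable names are only for illustration.

```python
import numpy as np

# Fixed seed: the same four numbers come back every time.
np.random.seed(0)
a = np.random.rand(4)
np.random.seed(0)
b = np.random.rand(4)
print(np.array_equal(a, b))  # True

# No argument: NumPy reseeds itself from OS entropy (or the clock),
# so these values change from run to run.
np.random.seed()
c = np.random.rand(4)
print(c)
```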

For more information on using seeds to generate pseudo-random numbers, see Wikipedia.

John1024
  • 147
    This answer should be added to the documentation of numpy. Thank you. – gorjanz Jan 27 '17 at 12:11
  • 13
    Also, when you call `numpy.random.seed(None)`, it "will try to read data from /dev/urandom (or the Windows analogue) if available or seed from the clock otherwise". – Jonathan Apr 09 '17 at 11:06
  • 1
    @Jonathan Excellent point about `numpy.random.seed(None)`. I updated the answer with that info and a link to the docs. – John1024 Apr 10 '17 at 00:21
  • @curio1729 The implementation may vary from one operating system to the next but numpy tries to make its commands, including `seed`, compatible. – John1024 Feb 05 '19 at 21:22
  • "starting with a number (the seed), multiplying it by a large number, then taking modulo of that product" — so if I set the seed to 0, multiply it by some large number and do modulo, I should end up at 0 again? Why don't I. Does `np.random.seed(0)` not actually set the seed to 0? – L3viathan Jan 25 '20 at 13:05
  • 1
    @L3viathan Good point! To be more complete & accurate, I should have mentioned that an offset is added. Answer updated. For those who want more details, I also added a link to wikipedia's discussion of pseudo-random number generators. – John1024 Jan 25 '20 at 21:13
  • 1
    Are the numbers guaranteed to be the same on a different python installation/different machine? – MrMartin Apr 06 '20 at 12:28
  • 1
    @MrMartin if you set seed to be a fixed value, say 53 here like `np.random.seed(53)`, it will always generate the same numbers irrespective of machine, os, platform. – avats Nov 08 '21 at 18:15
71

If you call np.random.seed(a_fixed_number) before every call to one of NumPy's other random functions, the result will be the same each time:

>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135  0.71518937 0.60276338 0.54488318]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135  0.71518937 0.60276338 0.54488318]

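However, if you call it only once and then use various random functions, successive calls still return different results (although the sequence as a whole is the same every time you start from that seed):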

>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> print(np.random.permutation(10))
[3 5 1 2 9 8 0 6 7 4]
>>> print(np.random.permutation(10))
[2 3 8 4 5 1 0 6 9 7]
>>> print(np.random.rand(4))
[0.64817187 0.36824154 0.95715516 0.14035078]
>>> print(np.random.rand(4))
[0.87008726 0.47360805 0.80091075 0.52047748]
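As the comments below point out, seeding once at the start is usually what you want: the individual calls differ from one another, but the whole run is reproducible. A small sketch of that pattern, where `run_experiment` is a hypothetical helper and not part of NumPy:

```python
import numpy as np


def run_experiment(seed):
    """Seed once, then make several random calls (hypothetical helper)."""
    np.random.seed(seed)  # seed a single time, at the start of the run
    perm = np.random.permutation(10)
    first = np.random.rand(4)
    second = np.random.rand(4)
    return perm, first, second


run_a = run_experiment(0)
run_b = run_experiment(0)

# Every call in the second run matches the corresponding call in the first.
same_across_runs = all(np.array_equal(x, y) for x, y in zip(run_a, run_b))
# Within one run, consecutive calls still give different numbers.
differ_within_run = not np.array_equal(run_a[1], run_a[2])

print(same_across_runs)   # True
print(differ_within_run)  # True
```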
Tobia Tesan
Zhun Chen
  • 11
    Is there a function that can be called once such that random seed is set for all subsequent `np.random` calls until the seed is changed? Having to call it every time seems needlessly verbose and easy to forget. – Lubed Up Slug Mar 02 '19 at 04:11
  • @LubedUpSlug you can decorate them – at least for some simple cases I tested it should work. `def seed_first(fun, seed=0):` | `\tdef wrapped(*args, **kwargs):` | `\t\tnp.random.seed(seed)` | `\t\treturn fun(*args, **kwargs)` | `\treturn wrapped`, and then `for m in np.random.__all__:` | `\tif m != 'seed':` | `\t\tsetattr(np.random, m, seed_first(getattr(np.random, m)))` However, this could lead to very subtle bugs and weird behavior in the long run. (Replace \t with four spaces, and | with line breaks...) – Sebastian Höffner Nov 27 '19 at 16:05
  • 2
    @SebastianHöffner thank you for your comment. My question was a little misguided because I was confused by the sentence "However, if you just call it once and use various random functions, the results will still be different:" Calling `np.random.seed()` once at the start of a program will always produce the same result for the same seed since the subsequent calls to `np.random` functions will deterministically change the seed for subsequent calls. Calling `np.random.seed()` before every call to `np.random` functions will probably produce undesired results. – Lubed Up Slug Nov 27 '19 at 20:08
23

As noted, numpy.random.seed(0) sets the random seed to 0, so the pseudo-random numbers you get from numpy.random will start from the same point. This can be good for debugging in some cases. However, after some reading, this seems to be the wrong approach if you use threads, because it is not thread-safe.

from differences-between-numpy-random-and-random-random-in-python:

For numpy.random.seed(), the main difficulty is that it is not thread-safe - that is, it's not safe to use if you have many different threads of execution, because it's not guaranteed to work if two different threads are executing the function at the same time. If you're not using threads, and if you can reasonably expect that you won't need to rewrite your program this way in the future, numpy.random.seed() should be fine for testing purposes. If there's any reason to suspect that you may need threads in the future, it's much safer in the long run to do as suggested, and to make a local instance of the numpy.random.Random class. As far as I can tell, random.random.seed() is thread-safe (or at least, I haven't found any evidence to the contrary).

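Example of how to go about this:

from numpy.random import RandomState

# No seed: each instance is seeded unpredictably, so the outputs differ.
prng = RandomState()
print(prng.permutation(10))
prng = RandomState()
print(prng.permutation(10))

# Fixed seed: both instances produce the same permutation.
prng = RandomState(42)
print(prng.permutation(10))
prng = RandomState(42)
print(prng.permutation(10))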

may give:

[3 0 4 6 8 2 1 9 7 5]

[1 6 9 0 2 7 8 3 5 4]

[8 1 5 0 7 2 9 4 3 6]

[8 1 5 0 7 2 9 4 3 6]
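NumPy 1.17 and later also offer `np.random.default_rng`, which follows the same "local instance" idea. The sketch below assumes such a NumPy version is installed. The values it produces will not match `RandomState(42)`, because the default generator uses a different algorithm.

```python
import numpy as np

# Two local generators created with the same seed give the same results.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same_local = np.array_equal(rng_a.permutation(10), rng_b.permutation(10))
print(same_local)  # True

# Drawing from a local generator does not disturb the global np.random state.
np.random.seed(0)
expected = np.random.rand(3)

np.random.seed(0)
_ = rng_a.random(3)  # consumes numbers from rng_a only
actual = np.random.rand(3)

global_untouched = np.array_equal(expected, actual)
print(global_untouched)  # True
```

Keeping the generator in a local variable, and passing it to the functions that need it, avoids sharing one hidden global state between threads or libraries.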

Lastly, note that there might be cases where initializing with 0 (as opposed to a seed whose bits are not all 0) results in non-uniform distributions for the first few iterations because of the way XOR works, but this depends on the algorithm, and is beyond my current worries and the scope of this question.

Community
ntg
20

I have used this very often in neural networks. It is well known that when we start training a neural network, we randomly initialise the weights. The model is then trained from these weights on a particular dataset. After a number of epochs you get a trained set of weights.

Now suppose you want to train again from scratch, or you want to pass the model to others so they can reproduce your results. The weights will again be initialised to random numbers, which will mostly be different from the earlier ones. The trained weights obtained after the same number of epochs (keeping the same data and other parameters) will therefore differ from the earlier ones. The problem is that your model is no longer reproducible: every time you train it from scratch, it gives you a different set of weights, because it is initialised with different random numbers every time.

What if the model were initialised to the same set of random weights every time you start training from scratch? In that case your model could become reproducible. This is achieved by numpy.random.seed(0). By setting seed() to a particular number, you always get the same set of random numbers.
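A minimal sketch of that idea, assuming a single weight matrix initialised with NumPy. The helper `init_weights`, its parameters, and the 0.01 scale are hypothetical choices for illustration, not taken from any particular framework.

```python
import numpy as np


def init_weights(seed, n_inputs=3, n_outputs=2):
    """Return a small random weight matrix (hypothetical helper)."""
    np.random.seed(seed)  # fix the starting point of the generator
    return np.random.randn(n_inputs, n_outputs) * 0.01


w_first = init_weights(0)
w_again = init_weights(0)
w_other = init_weights(1)

print(np.array_equal(w_first, w_again))  # True: same seed, same initial weights
print(np.array_equal(w_first, w_other))  # False: different seed
```

Identical initial weights are only one ingredient of a reproducible training run; data shuffling and any other random steps have to be seeded as well.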

A Santosh
13

I hope to give a really short answer:

seed makes the next series of random numbers predictable. You can think of it this way: every time you call seed, it pre-defines a series of numbers, and numpy.random keeps an iterator over that series; every time you ask for a random number, it simply returns the next one.

e.g.:

np.random.seed(2)
np.random.randn(2) # array([-0.41675785, -0.05626683])
np.random.randn(1) # array([-1.24528809])

np.random.seed(2)
np.random.randn(1) # array([-0.41675785])
np.random.randn(2) # array([-0.05626683, -1.24528809])

Notice that when I set the same seed, no matter how many random numbers you request from numpy in each call, it always gives the same series of numbers, which in this case is array([-0.41675785, -0.05626683, -1.24528809]).
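The same observation can be checked programmatically. This sketch reuses seed 2 from the example above and compares the series obtained with different request sizes; the variable names are only for illustration.

```python
import numpy as np

np.random.seed(2)
two_then_one = np.concatenate([np.random.randn(2), np.random.randn(1)])

np.random.seed(2)
one_then_two = np.concatenate([np.random.randn(1), np.random.randn(2)])

np.random.seed(2)
all_at_once = np.random.randn(3)

# However the three numbers are requested, the series is identical.
print(np.array_equal(two_then_one, one_then_two))  # True
print(np.array_equal(two_then_one, all_at_once))   # True
```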

RobotCharlie
5

All the answers above show how np.random.seed() is used in code. I'll try my best to explain briefly why it behaves this way. Computers are machines that work according to predefined algorithms. Any output from a computer is the result of an algorithm applied to some input. So when we ask a computer to generate random numbers, they may look random, but the computer did not just come up with them randomly!

So when we write np.random.seed(any_number_here), the algorithm will output a particular set of numbers that is unique to the argument any_number_here. It's almost as if a particular set of random numbers can be obtained if we pass the correct argument. But finding that argument would require us to know how the algorithm works, which is quite tedious.

So, for example, if I write np.random.seed(10), the particular set of numbers that I obtain will remain the same even if I execute the same line 10 years from now, unless the algorithm changes.

buhtz
WadeWilson
3

Imagine you are showing someone how to code something with a bunch of "random" numbers. By using numpy seed they can use the same seed number and get the same set of "random" numbers.

So it's not exactly random because an algorithm spits out the numbers but it looks like a randomly generated bunch.

cjHerold
2

A random seed specifies the start point when a computer generates a random number sequence.

For example, let’s say you wanted to generate a random number in Excel (Note: Excel sets a limit of 9999 for the seed). If you enter a number into the Random Seed box during the process, you’ll be able to use the same set of random numbers again. If you typed “77” into the box, and typed “77” the next time you run the random number generator, Excel will display that same set of random numbers. If you type “99”, you’ll get an entirely different set of numbers. But if you revert back to a seed of 77, then you’ll get the same set of random numbers you started with.

As a toy algorithm, consider: “take a number x, add 900 to it, then subtract 52.” In order for the process to start, you have to specify a starting number, x (the seed). Let’s take the starting number 77:

Add: 900 + 77 = 977
Subtract: 977 - 52 = 925

Following the same algorithm, the second “random” number would be:

Add: 900 + 925 = 1825
Subtract: 1825 - 52 = 1773

This simple example follows a pattern, but the algorithms behind computer number generation are much more complicated.
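The toy algorithm above is easy to write down as code. The function name `toy_sequence` is made up for this sketch; it only mirrors the "add 900, subtract 52" rule and has nothing to do with how NumPy or Excel generate numbers.

```python
def toy_sequence(seed, count):
    """Apply the toy rule "add 900, then subtract 52" `count` times."""
    values = []
    x = seed
    for _ in range(count):
        x = 900 + x - 52  # each output becomes the input for the next step
        values.append(x)
    return values


print(toy_sequence(77, 2))  # [925, 1773]
print(toy_sequence(77, 2) == toy_sequence(77, 2))  # True: same seed, same numbers
```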

1

There is a nice explanation in the NumPy docs: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.RandomState.html. It refers to the Mersenne Twister pseudo-random number generator. More details on the algorithm are here: https://en.wikipedia.org/wiki/Mersenne_Twister

Poe Dator
1
numpy.random.seed(0)
numpy.random.randint(10, size=5)

This produces the following output: array([5, 0, 3, 3, 7]). Again, if we run the same code, we will get the same result.

Now if we change the seed value from 0 to 1 or another value:

numpy.random.seed(1)
numpy.random.randint(10, size=5)

This produces the following output: array([5, 8, 9, 5, 0]), which is not the same as the output above.

Humayun Ahmad Rajib
0

All the random numbers generated after setting a particular seed value are the same across all platforms/systems.

Prashant Abdare
0

It makes the random numbers predictable. All of them start with the same combination and every iteration after that will be the same. Example:

Output A: 0, 1, 2
Output B: 1, 3, 5
Output C: 2, 4, 6
Reset seed to 0
Output A: 0, 1, 2
Output B: 1, 3, 5
Output C: 2, 4, 6
Reset seed to 0
Output A: 0, 1, 2
Reset seed to 0
Output A: 0, 1, 2
.
.
.

I hope this helped!

Astro648