Update as of numpy v1.17 (mid-2019):
The results should be the same across platforms, but not across numpy versions.
np.random.seed is described as a "convenience, legacy function"; neither it nor the more recent, recommended alternative np.random.default_rng can be relied on to produce the same result across numpy versions, unless you specifically use the legacy/compatibility API provided by np.random.RandomState. While the RandomState class is guaranteed to provide consistent results, it does not receive algorithmic (or correctness) improvements, and its use is discouraged outside of unit testing and backwards compatibility.
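As a rough illustration of the two APIs side by side, here is a minimal sketch. The seed value and variable names are arbitrary choices for the example. It only checks reproducibility within one installed numpy version; it cannot demonstrate behaviour across versions.

```python
import numpy as np

SEED = 12345  # arbitrary illustrative seed, not a recommended value

# Legacy/compatibility API: an explicit RandomState instance.
legacy_a = np.random.RandomState(SEED).random_sample(5)
legacy_b = np.random.RandomState(SEED).random_sample(5)

# Recommended API: a Generator built by default_rng.
modern_a = np.random.default_rng(SEED).random(5)
modern_b = np.random.default_rng(SEED).random(5)

# Within a single numpy version, the same seed gives the same stream.
same_legacy = np.array_equal(legacy_a, legacy_b)
same_modern = np.array_equal(modern_a, modern_b)
print("RandomState reproducible:", same_legacy)
print("default_rng reproducible:", same_modern)
```

Passing an explicit RandomState or Generator object around, rather than relying on the global state set by np.random.seed, also makes it clearer which API a given piece of code depends on.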
See NEP 0019: Random number generator policy. It's actually a decent read :) The abstract reads:
For the past decade, NumPy has had a strict backwards compatibility policy for the number stream of all of its random number distributions. Unlike other numerical components in numpy, which are usually allowed to return different results when they are modified if they remain correct, we have obligated the random number distributions to always produce the exact same numbers in every version. The objective of our stream-compatibility guarantee was to provide exact reproducibility for simulations across numpy versions in order to promote reproducible research. However, this policy has made it very difficult to enhance any of the distributions with faster or more accurate algorithms. After a decade of experience and improvements in the surrounding ecosystem of scientific software, we believe that there are now better ways to achieve these objectives. We propose relaxing our strict stream-compatibility policy to remove the obstacles that are in the way of accepting contributions to our random number generation capabilities.
This has been implemented in numpy. As of this writing (numpy version 1.22), numpy.random.default_rng() constructs a new Generator with the default BitGenerator. But the description of np.random.Generator comes with the following guidance attached:
No Compatibility Guarantee
Generator does not provide a version compatibility guarantee. In particular, as better algorithms evolve the bit stream may change.
Therefore, np.random.default_rng() will reproduce the same random numbers across platforms for a given numpy version, but not across versions. The best practice for ensuring reproducibility is to preserve your exact environment, e.g. in a Docker container. Short of this, storing the randomly generated data and using the saved results in downstream workflows can help with reproducibility, though of course this does not protect you from API changes later in your workflow the way a Docker container would.
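A minimal sketch of the "store the generated data" approach might look like the following. The file name, seed, and the provenance dictionary are all hypothetical choices for illustration, and a temporary directory stands in for wherever you would actually keep the file.

```python
import os
import tempfile

import numpy as np

SEED = 2024  # arbitrary illustrative seed
rng = np.random.default_rng(SEED)
samples = rng.normal(size=1000)

# Persist the generated values so downstream steps read the file
# instead of regenerating the numbers under a possibly different numpy.
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "samples.npy")  # hypothetical file name
np.save(path, samples)

# Downstream: load the stored values rather than calling the generator again.
reloaded = np.load(path)
roundtrip_ok = np.array_equal(samples, reloaded)
print("round trip exact:", roundtrip_ok)

# Recording the numpy version and seed alongside the data documents
# how the file was produced.
provenance = {"numpy_version": np.__version__, "seed": SEED}
print(provenance)
```

The .npy format stores the array's raw values, so the reloaded data matches what was generated regardless of how a later numpy version's Generator behaves.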