How to keep the behavior of random::seed in NumCpp consistent with NumPy?

Question

I am migrating a piece of code from NumPy to NumCpp, but seeding with nc::random::seed in NumCpp does not give the same random numbers as np.random.seed in NumPy.

How can I make NumCpp produce the same sequence as NumPy for the same seed?

Here is my Python code:

import numpy as np
np.random.seed(0)
print(np.random.randn(1, 4))

The output of this code is:

[[1.76405235 0.40015721 0.97873798 2.2408932 ]]

And here is my C++ code:

#include <iostream>
#include "NumCpp.hpp"

using namespace std;
int main(void) {
    nc::random::seed(0);
    cout << nc::random::randN<float>(nc::Shape(1, 4)) << endl;
    return 0;
}

The output is:

[[0.101919, -0.481323, -0.680603, 0.0649879, ]]

NumCpp is a distinct implementation from NumPy, so it's normal for them to use different pseudo-random generators. Is there a reason why you want them to produce exactly the same random numbers? — kotatsuyaki, Nov 29 '22 at 12:20
There are several valid techniques for generating normals, which can use different quantities of underlying uniforms. The base uniform generator may also be different. If the two libraries use different algorithms, you won’t be able to sync them. — pjs, Nov 29 '22 at 12:22
Both NumPy and NumCpp use the Mersenne Twister pseudo-random number generator (MT19937), so I wonder whether it is possible to make NumCpp's behavior consistent with NumPy's. NumPy: https://github.com/numpy/numpy/blob/8d61ebc25a117337d148f1e3d96066653bd6419a/numpy/random/_mt19937.pyx#L43 NumCpp: https://github.com/dpilger26/NumCpp/blob/master/include/NumCpp/Random/RNG.hpp#L72 — barriery, Nov 29 '22 at 12:26
Even if both use Mersenne Twister, the expansion of seed to state is not standardized across platforms. On top of that, if they use two entirely different algorithms for normals such as Box-Muller (which uses two uniforms) vs Ziggurat (which uses a variable number of uniforms), they won’t ever produce the same sequence of values. — pjs, Nov 29 '22 at 12:45
Even if the Twister sequences are the same, the number of calls used to create a specific final number is likely to differ. — hpaulj, Nov 29 '22 at 15:48
You might be interested in [this SO post](https://stackoverflow.com/q/72817047/2166798). It shows that numpy itself doesn’t always yield identical results on different systems due to underlying math library implementations. — pjs, Nov 29 '22 at 15:58
@barriery It might be worth suggesting to the authors of NumCpp that they update their docs to mention this difference. Maintaining backwards compatibility in software is awkward: you're using NumPy's "legacy" random interface, but there's also a [newer generator style](https://numpy.org/doc/stable/reference/random/generator.html) random interface that produces a different sequence from the same seed and has different (hopefully better) performance characteristics. — Sam Mason, Nov 29 '22 at 16:44
@barriery I’d suggest starting small with testing. Generate integer values using MT seeded identically. If the results are the same, then you know that both MT implementations expand the seed value in the same way to populate the 19937-bit state space. If not, check whether they both offer getstate/setstate capabilities and use that instead of `seed()`. Finally, if you can get the same integer values from MT, but get different normals, you’ll know they’re using different algorithms or have different floating point libraries. — pjs, Nov 30 '22 at 23:15

