
I am trying to port some C++ code to Python 3.6, but the sequence of pseudorandom numbers differs between the two implementations. The seed is the same in both, and as far as I know, both use the Mersenne Twister algorithm.

What am I doing wrong?

REMEMBER1: Both programs use the SAME seed.

REMEMBER2: As far as I know, both programs use functions that implement the SAME algorithm (Mersenne Twister).

C++:

#include <random>
#include <iostream>
int main(int argc, char* argv[])
{  
    std::mt19937 gen(2);
    std::uniform_int_distribution<> dis(0, 61);

    for (int n=0; n<10; ++n)
        std::cout << dis(gen) << ' ';

    return 0;
}

Python 3.6:

import numpy as np
rng = np.random.RandomState(2)
for i in range(10):
    print(str(rng.randint(0, 62)))

Note: randint has an exclusive upper bound. That is why I use 61 in the C++ code but 62 in the Python code.

Carlos Ost
  • *but the sequence of pseudo random numbers is different in each implementation* -- Isn't that a feature and not a bug? – PaulMcKenzie Jul 22 '19 at 22:01
  • As far as I know, both implementations use the same algorithm (Mersenne Twister), so the results should be the same as long as the seed is the same. – Carlos Ost Jul 22 '19 at 22:05
  • @thc really???? This is the sequence I got only on Python. I don't understand, but I will triple check :-) – Carlos Ost Jul 22 '19 at 22:11
  • Yes, I used: `g++ temp.cpp -o temp` then `./temp` output `40 15 45 8 22 43 18 11 40 7`. Compiler is clang on OS X. – thc Jul 22 '19 at 22:16
  • Mine is g++ (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0 and I really can't figure out why it would give different results anyway, but thank you again. – Carlos Ost Jul 22 '19 at 22:20
  • The seeding algorithms may be different (integer value to internal state). The original seeding algorithm used by Matsumoto and Nishimura is now considered suboptimal. – user515430 Jul 22 '19 at 23:57

2 Answers


You should note that C++'s standard library distributions, including std::uniform_int_distribution, use implementation-defined algorithms. In other words, these implementations may change depending on which C++ library implementation you choose, and those libraries may change those algorithms in the future. (This is in contrast to C++'s random engine classes, such as std::mt19937, which do guarantee returning the same pseudorandom values from the same seed.) See also this answer.

Your best course of action is to implement or find a stable implementation of an RNG algorithm (such as an algorithm I describe in my article) and implement methods to transform the random numbers they deliver. (There are certain things to keep in mind when choosing an RNG for a particular application; the first article I linked here has more information.)
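
As a rough illustration of that advice, here is a minimal pure-Python sketch of the MT19937 engine with the same integer seeding and tempering parameters the C++ standard specifies for `std::mt19937`, plus one possible bounded-integer transform. The class name `MT19937` and the helper `bounded` are hypothetical names chosen for this example, and the rejection-sampling rule in `bounded` is just one choice; it is not claimed to match what any `std::uniform_int_distribution` or `numpy` implementation does. To get matching numbers on the C++ side, you would call `gen()` directly and apply the same rule by hand instead of using `std::uniform_int_distribution`.

```python
class MT19937:
    """Sketch of the 32-bit Mersenne Twister with std::mt19937's parameters."""

    N, M = 624, 397

    def __init__(self, seed):
        # Integer seeding: state[i] = 1812433253 * (s ^ (s >> 30)) + i (mod 2**32)
        self.state = [0] * self.N
        self.state[0] = seed & 0xFFFFFFFF
        for i in range(1, self.N):
            prev = self.state[i - 1]
            self.state[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = self.N  # force a twist before the first output

    def _twist(self):
        # Regenerate all 624 state words in place.
        for i in range(self.N):
            y = (self.state[i] & 0x80000000) | (self.state[(i + 1) % self.N] & 0x7FFFFFFF)
            nxt = self.state[(i + self.M) % self.N] ^ (y >> 1)
            if y & 1:
                nxt ^= 0x9908B0DF
            self.state[i] = nxt
        self.index = 0

    def next_u32(self):
        """Return the next raw 32-bit output (what gen() returns in C++)."""
        if self.index >= self.N:
            self._twist()
        y = self.state[self.index]
        self.index += 1
        # Tempering step.
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y & 0xFFFFFFFF


def bounded(rng, n):
    """Hypothetical helper: unbiased integer in [0, n) by rejection sampling."""
    limit = (1 << 32) - ((1 << 32) % n)  # largest multiple of n that fits in 2**32
    while True:
        x = rng.next_u32()
        if x < limit:
            return x % n


rng = MT19937(2)
print([bounded(rng, 62) for _ in range(10)])
```

A quick sanity check for the engine itself: the C++ standard requires that the 10000th output of a default-constructed `std::mt19937` (seed 5489) be 4123659995, so `MT19937(5489)` should produce that value on its 10000th call to `next_u32()`.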

Peter O.
  • and you didn't understand his answer. Basically, he says that even though you used `mt19937`, which is well defined, `std::uniform_int_distribution` is not strictly defined, so even in C++ you can get different results from compiler to compiler. – Marek R Jul 22 '19 at 22:39
  • @carlosOst: You are using not only `std::mt19937`, but also `std::uniform_int_distribution`, which is implementation-defined -- it can vary from implementation to implementation of the C++ standard library. – Peter O. Jul 22 '19 at 22:43
  • Pretty sure C++ and Python also use different rules for initializing the Mersenne Twister state (which is huge); you may pass `2` as the seed in both cases, but it'll be expanded to completely different internal state, so even without `std::uniform_int_distribution` interfering, the behavior would differ. Not sure if `numpy`'s version is more similar to C++, but [Python's built-in seed expansion algorithm is pretty different from what I can tell](https://github.com/python/cpython/blob/3.7/Modules/_randommodule.c#L258). – ShadowRanger Jul 23 '19 at 02:48

There isn't one unique way of getting from an RNG to a single bounded int. See for example:

http://www.pcg-random.org/posts/bounded-rands.html

That article describes several approaches. Note that C++ and Python choose different ones, hence you'll get a different sequence from the "same" RNG and seed.
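
To make this concrete, here is a small sketch that maps the same raw 32-bit values into [0, 62) in two different ways. The function names are hypothetical, and neither function is claimed to be the method used by libstdc++, libc++, or `numpy`; they are simply two of the simpler mappings, shown without the rejection step that an unbiased version would need.

```python
def bounded_modulo(x, n):
    """Map a raw 32-bit value to [0, n) by taking the remainder."""
    return x % n


def bounded_multiply_shift(x, n):
    """Map a raw 32-bit value to [0, n) by scaling: floor(x * n / 2**32)."""
    return (x * n) >> 32


# The same raw engine outputs, mapped two ways (sample inputs chosen by hand).
raw_values = [0x00000000, 0x80000000, 0xFFFFFFFF]
for x in raw_values:
    print(hex(x), bounded_modulo(x, 62), bounded_multiply_shift(x, 62))
```

Both functions return valid values in the requested range, yet they generally disagree for the same input, so two libraries that share an engine and a seed can still print different sequences.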

Sam Mason