8

More specifically, numpy:

In [24]: a=np.random.RandomState(4)
In [25]: a.rand()
Out[25]: 0.9670298390136767
In [26]: a.get_state()
Out[26]: 
('MT19937',
 array([1248735455, ..., 1532921051], dtype=uint32),
 2,0,0.0)

octave:

octave:17> rand('state',4)
octave:18> rand()
ans =  0.23605
octave:19> rand('seed',4)
octave:20> rand()
ans =  0.12852

Octave claims to perform the same algorithm (Mersenne Twister with a period of 2^{19937-1})

Anybody know why the difference?

khagler
  • 3,996
  • 29
  • 40
phil0stine
  • 303
  • 1
  • 13

3 Answers3

10

Unfortunately the MT19937 generator in Octave does not allow you to initialise it using a single 32-bit integer as np.random.RandomState(4) does. If you use rand("seed",4) this actually switches to an earlier version of the PRNG used previously in Octave, which PRNG is not MT19937 at all, but rather the Fortran RANDLIB.

It is possible to get the same numbers in NumPy and Octave, but you have to hack around the random seed generation algorithm in Octave and write your own function to construct the state vector out of the initial 32-bit integer seed. I am not an Octave guru, but with several Internet searches on bit manipulation functions and integer classes in Octave/Matlab I was able to write the following crude script to implement the seeding:

function state = mtstate(seed)

state = uint32(zeros(625,1));

state(1) = uint32(seed);
for i=1:623,
   tmp = uint64(1812433253)*uint64(bitxor(state(i),bitshift(state(i),-30)))+i;
   state(i+1) = uint32(bitand(tmp,uint64(intmax('uint32'))));
end
state(625) = 1;

Use it like this:

octave:9> rand('state',mtstate(4));
octave:10> rand(1,5)
ans =

   0.96703   0.54723   0.97268   0.71482   0.69773

Just for comparison with NumPy:

>>> a = numpy.random.RandomState(4)
>>> a.rand(5)
array([ 0.96702984,  0.54723225,  0.97268436,  0.71481599,  0.69772882])

The numbers (or at least the first five of them) match.

Note that the default random number generator in Python, provided by the random module, is also MT19937, but it uses a different seeding algorithm, so random.seed(4) produces a completely different state vector and hence the PRN sequence is then different.

Hristo Iliev
  • 72,659
  • 12
  • 135
  • 186
2

If you look at the details of the Mersenne Twister algorithm, there are lots of parameters that affect the actual numbers produced. I don't think Python and Octave are trying to produce the same sequence of numbers.

Ned Batchelder
  • 364,293
  • 75
  • 561
  • 662
  • fair enough. It would be nice to set, or at least get, the default parameters. I am not so optimistic about that happening with octave. Thanks – phil0stine Dec 06 '12 at 01:22
  • A larger question is: why do you want them to produce the same sequence? – Ned Batchelder Dec 06 '12 at 01:24
  • Yes exactly @mmgp. It's a very nice construct to be able to mimic a random sequence but control it during debug to compare results. – phil0stine Dec 06 '12 at 01:44
  • This isn't quite right. I'm virtually certain that the algorithm per se is identical in both cases. Everyone just takes the official code for MT19937 and uses it. What probably differs is how each implementation takes a seed integer and expands that to an initial 624-word state. If you replicated that initial 624-word state, I'm fairly certain that you would get identical numbers coming out. Providing that the [0..1] float normalization step is the same, of course. – Robert Kern Dec 06 '12 at 11:08
  • @RobertKern: Indeed, the seed->initial state mapping is probably the "detail of the algorithm" that is different. – Ned Batchelder Dec 06 '12 at 18:58
  • The seeding routine is fixed in the MT19937 algorithm. The problem here is that Octave does not allow seeding of its MT generator with anything different from the full state vector so one really has to replicate the seeding routine. `rand('seed',x)` switches to a completely different PRNG. – Hristo Iliev Dec 14 '12 at 10:22
-1

It looks like numpy is returning the raw random integers, whereas octave is normalising them to floats between 0 and 1.0.

DaveP
  • 6,952
  • 1
  • 24
  • 37
  • Nope, in the first snippet `a.rand()` returns a `float` (which [*is* normalized between 0 and 1.0](http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.rand.html#numpy.random.rand)). Those integers you see are part of the internal state of the RNG. – Matteo Italia Dec 06 '12 at 01:12