
Python, NumPy and R all use the same algorithm (Mersenne Twister) for generating random number sequences. In theory, then, setting the same seed should produce the same random number sequence in all three. It does not. I suspect the three implementations use different parameters, which causes this behavior.

R
>set.seed(1)
>runif(5)
[1] 0.2655087 0.3721239 0.5728534 0.9082078 0.2016819
Python
In [3]: random.seed(1)

In [4]: [random.random() for x in range(5)]
Out[4]: 
[0.13436424411240122,
 0.8474337369372327,
 0.763774618976614,
 0.2550690257394217,
 0.49543508709194095]

NumPy
In [23]: import numpy as np

In [24]: np.random.seed(1)
In [25]: np.random.rand(5)
Out[25]: 
array([  4.17022005e-01,   7.20324493e-01,   1.14374817e-04,
         3.02332573e-01,   1.46755891e-01])

Is there some way to make the NumPy and Python implementations produce the same random number sequence? Of course, as some comments and answers point out, one could use rpy. What I am specifically looking for is a way to fine-tune the parameters of the respective calls in Python and NumPy to get the same sequence.

Context: The concern comes from an edX course in which R is used. In one of the forums, someone asked whether Python could be used, and the staff replied that some assignments would require setting specific seeds and submitting the answers.

Related:

  1. Comparing Matlab and Numpy code that uses random number generation: from this, it seems that the underlying NumPy and Matlab implementations are similar.
  2. python vs octave random generator: This question does come fairly close to the intended answer. Some sort of wrapper around the default state generator is required.
Nipun Batra
  • You could use RPy (or something similar) to get the random numbers from R directly. http://rpy.sourceforge.net/ – tom10 Mar 06 '14 at 01:59
  • @tom10 That is possible. But, there should be a cleaner way of doing so, I guess! – Nipun Batra Mar 06 '14 at 02:03
  • Just from having a quick poke at things like `.Random.seed` in R and `np.random.get_state()` in `numpy`, it looks like `numpy` uses unsigned ints whereas R uses signed ones for the RNG's state- I'm not sure if you can really force them to behave the same. – Marius Mar 06 '14 at 02:14
  • [query the web](http://www.random.org/clients/http/), e.g., `scan("http://www.random.org/integers/?num=10&min=1&max=6&col=1&base=10&format=plain&rnd=id.xyz") ` – Martin Morgan Mar 06 '14 at 02:34
  • @tom10: I was looking forward to fine tune the parameters in the functions to achieve this. Not sure if that is possible. – Nipun Batra Mar 06 '14 at 02:50
  • @NipunBatra: That's a good and interesting question, and you do specifically ask it. I think, though, that you'd get more people answering it if you emphasized that over wanting something so you can do your coursework. – tom10 Mar 06 '14 at 03:20
  • @MartinMorgan: There is [a CRAN package for that](http://dirk.eddelbuettel.com/code/random.html). – Dirk Eddelbuettel Mar 06 '14 at 03:21
  • @NipunBatra: I think you underestimate how involved the state keeping of the default number generators in R is (as R really has a collection of RNGs between which you can switch). The only real way to have the same stream across different languages is to reimplement a common generator. Paul Gilbert did that 15 years ago to generate identical simulations in R and S-Plus (a cousin of R and a different, earlier dialect of S). – Dirk Eddelbuettel Mar 06 '14 at 03:25
  • If someone wanted to badly enough it should be possible to write a Python wrapper that calls the relevant functions in the standalone `libRmath` library. @GJJ's answer below would be lighter-weight though. – Ben Bolker May 16 '17 at 01:23
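For the Python and NumPy half of the question, the `np.random.get_state()` call mentioned in the comments suggests one possible approach: copy NumPy's Mersenne Twister state into the standard library's `random` module. The sketch below assumes NumPy's legacy global generator (`np.random.seed`, `np.random.rand`) and relies on the layout of CPython's `random.getstate()` tuple (version 3), which is an implementation detail rather than a documented contract. The helper name `sync_python_with_numpy` is made up for illustration. It does nothing for R.

```python
import random

import numpy as np


def sync_python_with_numpy(seed):
    """Hypothetical helper: seed NumPy's legacy global generator, then
    copy its Mersenne Twister state into the stdlib `random` module."""
    np.random.seed(seed)
    # Legacy state tuple: (name, 624 state words, position, has_gauss, cached_gaussian)
    _name, keys, pos, _has_gauss, _cached = np.random.get_state()
    # Assumed CPython layout: (version, 624 state words + position, gauss_next)
    random.setstate((3, tuple(int(k) for k in keys) + (int(pos),), None))


sync_python_with_numpy(1)
py_vals = [random.random() for _ in range(5)]  # drawn from the copied state
np_vals = np.random.rand(5)                    # drawn from NumPy's own state
print(py_vals)
print(np_vals.tolist())
```

Both generators start from the same state and build doubles from two 32-bit outputs in the same way, so the two printed lists should match the NumPy output shown in the question. The states are independent copies after the call, so draws from one do not advance the other.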

2 Answers


Use rpy2 to call R from Python. Here is a demo; the NumPy array data shares memory with x in R:

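import numpy as np
import rpy2.robjects as robjects

# Run the R code; the value of the last expression (x) comes back as an R vector
data = robjects.r("""
set.seed(1)
x <- runif(5)
""")

# View the R vector as a NumPy array
print(np.array(data))

# Modify the vector from Python...
data[1] = 1.0

# ...and look at x on the R side
print(robjects.r["x"])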
HYRY
  • Thanks. This looks easy. I would still prefer to avoid rpy2, since the same algorithm is being used. – Nipun Batra Mar 06 '14 at 02:09

I realize this is an old question, but I've stumbled upon the same problem recently, and created a solution which can be useful to others.

I've written a random number generator in C, and linked it to both R and Python. This way, the random numbers are guaranteed to be the same in both languages since they are generated using the same C code.

The program is called SyncRNG and can be found here: https://github.com/GjjvdBurg/SyncRNG.
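To illustrate the general idea of a shared generator (this is not SyncRNG's code, and not necessarily its algorithm), here is a minimal sketch of a generator simple enough to reimplement line for line in another language. It uses the Park-Miller "minimal standard" Lehmer generator; the class and method names are made up for this example.

```python
class MinimalStandardRNG:
    """Park-Miller "minimal standard" Lehmer generator (illustrative only)."""

    MODULUS = 2**31 - 1   # a Mersenne prime
    MULTIPLIER = 16807

    def __init__(self, seed):
        # The state must lie in [1, MODULUS - 1]; map a zero residue to 1.
        self.state = seed % self.MODULUS or 1

    def next_int(self):
        """Advance the state and return it as an integer in [1, MODULUS - 1]."""
        self.state = (self.MULTIPLIER * self.state) % self.MODULUS
        return self.state

    def random(self):
        """Return a float strictly between 0 and 1."""
        return self.next_int() / self.MODULUS


rng = MinimalStandardRNG(1)
first_three = [rng.next_int() for _ in range(3)]
print(first_three)  # [16807, 282475249, 1622650073]
```

Every intermediate product stays below 2^53, so a port that uses double-precision arithmetic (as R does for ordinary numerics) should be able to reproduce the same integers exactly. The statistical quality of this generator is far below that of Mersenne Twister, so it is only meant to show the approach.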

GjjvdBurg