9

I wonder if it is possible to exactly reproduce the whole sequence of randn() of MATLAB with NumPy. I coded my own routine with Python/Numpy, and it is giving me a little bit different results from the MATLAB code somebody else did, and I am having hard time finding out where it is coming from because of different random draws.

I have found the numpy.random.seed value which produces the same number for the first draw, but from the second draw and on, it is completely different. I'm making multivariate normal draws for about 20,000 times so I don't want to just save the matlab draws and read it in Python.

joon
  • 3,899
  • 1
  • 40
  • 53
  • 1
    What's wrong with `numpy.random.randn(...)`? It should do exactly what you need, unless you're worried about generating _exactly_ the same sequence of numbers with a given seed... (Nevermind, obviously that's what you're trying to do, now that I've re-read the question. Out of vague curiosity, why do you need the exact sequence to be the same?) – Joe Kington Sep 15 '10 at 21:56
  • What do you mean "MATLAB code somebody else did"? `randn` is a standard MATLAB function. I sincerely hope that you aren't using some "custom" function off the trash heap that is FEX. – Nick T Sep 15 '10 at 21:58
  • 2
    Nothing wrong with it, I need to generate exactly the same sequence of numbers with a given seed. As I said, my routine with NumPy is giving me different result from a MATLAB code and I cannot debug it because of the difference in random numbers. I cannot identify where it went wrong. – joon Sep 15 '10 at 21:59
  • @Nict T of course not. I'm running some Bayesian model and my code is giving me different estimates from the MATLAB code. I'm pretty sure the MATLAB code is correct, so I'm trying to debug my code, but because of different random draws I cannot identify where the calculations differ. – joon Sep 15 '10 at 22:01
  • 4
    Still not answering your question, but saving/loading isn't *that* bad. SciPy has a MATLAB `.mat` file translator, so you could dump your MATLAB workspace to a file and bring it in fairly easily with SciPy.io.mio http://www.scipy.org/doc/api_docs/SciPy.io.mio.html – Nick T Sep 15 '10 at 22:04
  • Yes I guess I have to do that. For whole result I need about 10,000,000 draws, but I guess for the test I can just save small amount of draws and will be able to see where the calculation differs. Thanks! – joon Sep 15 '10 at 22:08
  • Happy to report that I was successful to reproduce the results. Ended up saving/loading 100 mil draws. Thanks. – joon Sep 16 '10 at 19:34

3 Answers3

8

The user asked if it was possible to reproduce the output of randn() of Matlab, not rand. I have not been able to set the algorithm or seed to reproduce the exact number for randn(), but the solution below works for me.

In Matlab: Generate your normal distributed random numbers as follows:

rng(1);
norminv(rand(1,5),0,1)
ans = 
   -0.2095    0.5838   -3.6849   -0.5177   -1.0504

In Python: Generate your normal distributed random numbers as follows:

import numpy as np
from scipy.stats import norm
np.random.seed(1)
norm.ppf(np.random.rand(1,5))
array([[-0.2095,  0.5838, -3.6849, -0.5177,-1.0504]])

It is quite convenient to have functions, which can reproduce equal random numbers, when moving from Matlab to Python or vice versa.

Jens Munk
  • 4,627
  • 1
  • 25
  • 40
5

If you set the random number generator to the same seed, it will theoretically create the same numbers, ie in matlab. I am not quite sure how to best do it, but this seems to work, in matlab do:

rand('twister', 5489)

and corresponding in numy:

np.random.seed(5489)

To (re)initalize your random number generators. This gives for me the same numbers for rand() and np.random.random(), however not for randn, I am not sure if there is an easy method for that.

With newer matlab versions you can probably set up a RandStream with the same properties as numpy, for older you can reproduce numpy's randn in matlab (or vice versa). Numpy uses the polar form to create the uniform numbers from np.random.random() (the second algorithm given here: http://www.taygeta.com/random/gaussian.html). You could just write that algorithm in matlab to create the same randn numbers as numpy does from the rand function in matlab.

If you don't need a huge amount of random numbers, just save them in a .mat and read them from scipy.io though...

seberg
  • 8,785
  • 2
  • 31
  • 30
4

Just wanted to further clarify on using the twister/seeding method: MATLAB and numpy generate the same sequence using this seeding but will fill them out in matrices differently.

MATLAB fills out a matrix down columns, while python goes down rows. So in order to get the same matrices in both, you have to transpose:

MATLAB:

rand('twister', 1337);
A = rand(3,5)
A = 
 Columns 1 through 2
   0.262024675015582   0.459316887214567
   0.158683972154466   0.321000540520167
   0.278126519494360   0.518392820597537
  Columns 3 through 4
   0.261942925565145   0.115274226683149
   0.976085284877434   0.386275068634359
   0.732814552690482   0.628501179539712
  Column 5
   0.125057926335599
   0.983548605143641
   0.443224868645128

python:

import numpy as np
np.random.seed(1337)
A = np.random.random((5,3))
A.T
array([[ 0.26202468,  0.45931689,  0.26194293,  0.11527423,  0.12505793],
       [ 0.15868397,  0.32100054,  0.97608528,  0.38627507,  0.98354861],
       [ 0.27812652,  0.51839282,  0.73281455,  0.62850118,  0.44322487]])

Note: I also placed this answer on this similar question: Comparing Matlab and Numpy code that uses random number generation

Community
  • 1
  • 1
Diego
  • 16,830
  • 8
  • 34
  • 46