I am trying to write a function that generates fake data for use in a separate analysis. Here are the requirements for the function.
Problem 1
In this problem you will create fake data using numpy. In the cell below, the function create_data takes two parameters, "n" and "rand_gen".
- The "rand_gen" parameter is a pseudo-random number generator. We use a seeded pseudo-random number generator so that the results are reproducible from run to run.
- Use the numpy.random.randn function of the pseudo-random generator to create a numpy array of length n and return the array.
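To illustrate the reproducibility point in the requirements, here is a minimal sketch, assuming numpy is imported as np. It shows that two RandomState objects built from the same seed produce identical draws, and that each generator carries its own state. The variable names (gen_a, gen_b, and so on) are illustrative and not part of the assignment.

```python
import numpy as np

# Two independent generators built from the same seed.
gen_a = np.random.RandomState(seed=23)
gen_b = np.random.RandomState(seed=23)

# randn is a method of the generator object itself.
draws_a = gen_a.randn(5)
draws_b = gen_b.randn(5)

# Same seed, same sequence of standard normal samples.
same_seed_match = np.array_equal(draws_a, draws_b)
print(same_seed_match)  # True

# A second call on the same generator continues its sequence
# rather than restarting it.
next_draws = gen_a.randn(5)
```

The seed fixes the starting state of the generator, so the "random" sequence is fully determined by it.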
Here is the function I have created.
import numpy as np

def create_data(n, rand_gen):
    '''
    Creates a numpy array with n samples from the standard normal distribution

    Parameters
    ----------
    n : integer for the number of samples to create
    rand_gen : pseudo-random number generator from numpy

    Returns
    -------
    numpy array from the standard normal distribution of size n
    '''
    numpy_array = np.random.randn(n)
    return numpy_array
Here is the first test I run on my function.
create_data(10, np.random.RandomState(seed=23))
I need the output to be this exact array.
[0.66698806, 0.02581308, -0.77761941, 0.94863382, 0.70167179,
-1.05108156, -0.36754812, -1.13745969, -1.32214752, 1.77225828]
My output is different on every run. I do not fully understand how the RandomState call uses the seed to produce the array above instead of new random values each time. I know I need to use the rand_gen parameter inside my function, but I do not know where, probably because I do not understand what it actually does.
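A minimal sketch of the likely fix, assuming rand_gen is a numpy.random.RandomState instance as in the test call above. The module-level np.random.randn draws from numpy's global generator and ignores rand_gen entirely, which is why the output changes on each run. Calling randn on the object that was passed in uses its seeded state instead.

```python
import numpy as np

def create_data(n, rand_gen):
    '''
    Creates a numpy array with n samples from the standard normal distribution,
    drawn from the generator that was passed in.
    '''
    # Use the generator's own randn method, not the module-level
    # np.random.randn, so the seed given by the caller is respected.
    return rand_gen.randn(n)

# Two calls with freshly seeded generators give the same array.
first = create_data(10, np.random.RandomState(seed=23))
second = create_data(10, np.random.RandomState(seed=23))
print(np.array_equal(first, second))  # True
```

With seed=23, the result should be the array listed in the test above, provided the assignment's expected values were produced the same way.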