I have the following list:

series=[0.6, 4.1, 0.6, 6.7, 9.2, 7.6, 5.5, 0.9, 3.8, 8.4]

The mean of series is 4.74 and its standard deviation (np.std) is 3.101.

I want to generate 1000 observations from series, so I used the following method:

>>> series_1000 = np.random.normal(4.74, 3.101, size=1000)
>>> series_1000
array([ 3.43395217,  6.60462489,  5.27316166,  4.20429521,  4.76772334,
        8.04441319, -0.6967243 ,  0.53378519,  2.1736758 ,  9.96333279....

Problem

The above method seems to work, but it assumes that series is normally distributed.

Goal

My goal is to find a way of simulating values without making any assumption about the distribution of the original series.

Any help from your side will be highly appreciated.

Khaled DELLAL
  • You can [bootstrap](https://stats.stackexchange.com/questions/383376/sampling-from-empirical-distribution) from the [empirical cdf](https://stackoverflow.com/questions/15792552/numpy-scipy-equivalent-of-r-ecdfxx-function). – Michael Szczesny Nov 19 '22 at 12:48
  • @MichaelSzczesny Can you please add more details about this in an answer ? – Khaled DELLAL Nov 19 '22 at 13:00
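The bootstrap suggested in the comment above can be sketched as follows. This is a minimal sketch, not a definitive answer: it assumes that resampling the ten observed values with replacement is acceptable, and the helper name `bootstrap_sample` and its `seed` parameter are illustrative, not taken from the original post.

```python
import numpy as np

series = [0.6, 4.1, 0.6, 6.7, 9.2, 7.6, 5.5, 0.9, 3.8, 8.4]

def bootstrap_sample(data, size=1000, seed=None):
    """Draw `size` values from `data` with replacement.

    Hypothetical helper: this samples from the empirical distribution
    of `data`, so no parametric family (normal, uniform, ...) is chosen.
    """
    rng = np.random.default_rng(seed)
    return rng.choice(np.asarray(data, dtype=float), size=size, replace=True)

series_1000 = bootstrap_sample(series, size=1000, seed=0)
```

Every generated value is one of the original observations, so the sample can never contain a number that was not in `series`. Whether that limitation matters depends on what the simulated data is used for.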

1 Answer

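If a uniform distribution is better suited to your needs, you can use the expression below. Note that np.random.uniform(-1, 1) has a standard deviation of 1/sqrt(3), so the samples are scaled by sqrt(3) * std to match the target standard deviation:

(np.random.uniform(-1, 1, size=1000) * np.sqrt(3) * 3.101) + 4.74

Or inside a convenience function:

def generate_values(mean, std, size=1000):
    # Uniform on [mean - sqrt(3)*std, mean + sqrt(3)*std] has this mean and std
    return np.random.uniform(-1, 1, size=size) * np.sqrt(3) * std + mean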
Edu
  • Thanks @Edu, I think this works under the assumption that `series` is uniformly distributed, but I am looking for a method that makes no assumption. – Khaled DELLAL Nov 19 '22 at 11:21
  • You cannot sample a number from an undefined distribution. The uniform distribution is, however, the least biased choice, so I would go for that if you don't know anything about the original distribution. – Edu Nov 19 '22 at 11:24
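One way to sample without picking a parametric family, as the question asks, is inverse-transform sampling on the empirical quantiles of the data. The following is only a sketch: it assumes that linear interpolation between the sorted observations is an acceptable smoothing (which is itself a mild assumption), and the function name `sample_empirical_quantiles` is hypothetical.

```python
import numpy as np

series = [0.6, 4.1, 0.6, 6.7, 9.2, 7.6, 5.5, 0.9, 3.8, 8.4]

def sample_empirical_quantiles(data, size=1000, seed=None):
    """Inverse-transform sampling from the empirical quantile function.

    Hypothetical helper: uniform draws in [0, 1] are mapped through
    np.quantile, which interpolates linearly between sorted observations
    by default.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=size)
    return np.quantile(np.asarray(data, dtype=float), u)

smooth_1000 = sample_empirical_quantiles(series, size=1000, seed=0)
```

Unlike plain resampling, this can produce values between the observed points, but never anything outside the observed range of 0.6 to 9.2.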