
I am trying to simulate the performance of a real-life process. The variables that have been measured historically fall within a fixed interval, so being lower or greater than those values is physically impossible.

To simulate the process output, the historical data of each input variable was represented by its best-fit probability distribution (using this approach: Fitting empirical distribution to theoretical ones with Scipy (Python)?).

However, when the resulting theoretical distribution is simulated n times, it does not respect the expected real-life minimum and maximum values. I am thinking of applying a try-except test on each simulation to check whether every simulated value lies within the expected interval, but I am not sure this is the best way to handle it, because the experimental mean and variance are then not achieved.
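
For illustration, a minimal sketch of the issue (the interval [0, 10] and the normal fit are placeholders, not my actual process data):

import numpy as np
from scipy import stats

# Historical measurements known to lie in a physical interval, e.g. [0, 10]
historical = np.clip(np.random.normal(5, 2, size=500), 0, 10)

# Best-fit theoretical distribution (a normal fit, for illustration only)
mu, sigma = stats.norm.fit(historical)

# Sampling the fitted distribution n times can yield values outside [0, 10]
simulated = stats.norm.rvs(mu, sigma, size=10000)
print(simulated.min(), simulated.max())  # may fall below 0 or above 10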

jmparejaz
  • Please show actual code for what you're currently doing, I'm finding your verbal explanation to be unclear. – pjs Feb 24 '19 at 17:28

1 Answer


You can use a boolean mask in NumPy to regenerate the values that fall outside the required boundaries. For example:

import numpy as np

def random_with_bounds(func, size, bounds):
    # Draw an initial sample of the requested size
    x = func(size=size)
    # Boolean mask of values outside [bounds[0], bounds[1]]
    r = (x < bounds[0]) | (x > bounds[1])
    while r.any():
        # Redraw only the out-of-bounds entries
        x[r] = func(size=r.sum())
        # Update the mask at the positions that were redrawn
        r[r] = (x[r] < bounds[0]) | (x[r] > bounds[1])
    return x

Then you can use it like:

random_with_bounds(np.random.normal, 1000, (-1, 1))
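
If the sampler comes from a fitted scipy distribution rather than np.random, it can be passed in the same way as long as it accepts a size argument (a sketch; the loc/scale values below are placeholders for parameters obtained from the fit):

from scipy import stats

# Frozen distribution with parameters fitted elsewhere, e.g. via stats.norm.fit
dist = stats.norm(loc=5.0, scale=2.0)

# dist.rvs accepts size=..., so it works directly as `func`
samples = random_with_bounds(dist.rvs, 1000, (0, 10))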

Another version using index arrays via np.argwhere gives slightly increased performance:

def random_with_bounds_2(func, size, bounds):
    # Draw an initial sample of the requested size
    x = func(size=size)
    # Indices of values outside [bounds[0], bounds[1]]
    r = np.argwhere((x < bounds[0]) | (x > bounds[1])).ravel()
    while r.size > 0:
        # Redraw only the out-of-bounds entries
        x[r] = func(size=r.size)
        # Keep only the indices that are still out of bounds
        r = r[(x[r] < bounds[0]) | (x[r] > bounds[1])]
    return x
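
Usage is identical; a quick sanity check (assuming NumPy is imported as np):

x = random_with_bounds_2(np.random.normal, 1000, (-1, 1))
print(x.min(), x.max())  # both lie within [-1, 1]
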
a_guest