I have generated random data that follows a normal distribution using the code below:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

rng = np.random.default_rng()
number_of_rows = 10000
mu = 0
sigma = 1
data = rng.normal(loc=mu, scale=sigma, size=number_of_rows)

# sns.distplot is deprecated since seaborn 0.11; kdeplot draws the same density curve
dist_plot_data = sns.kdeplot(data)
plt.show()

The above code generates the distribution plot below, as expected:

[image: distribution plot of the generated data]

If I want a distribution plot that is exactly the inverse of this curve, like the one below, how can I generate the random data?

[image: the desired inverse curve]

I want data whose distribution plot shows the inverse curve. How can I generate it?

BC Smith
  • Does this answer your question? [Inverse normal random number generation in python?](https://stackoverflow.com/questions/62899379/inverse-normal-random-number-generation-in-python) – Peter O. Nov 17 '20 at 16:28

1 Answer

Not sure how useful this is, but it's easy to do with rejection sampling. Borrowing the API from Peter O's previous solution, but working in blocks for performance, gives me:

import numpy as np

def invNormal(low, high, mu=0, sd=1, *, size=1, block_size=1024):
    """Draw `size` samples on [low, high] with density proportional to
    1 - exp(-(x - mu)**2 / (2 * sd**2)), i.e. an inverted normal curve."""
    remain = size
    result = []

    # exponent factor of the normal density: -1 / (2 * sd**2)
    mul = -0.5 * sd**-2

    while remain:
        # draw next block of uniform variates within interval
        x = np.random.uniform(low, high, size=min((remain+5)*2, block_size))

        # reject with probability proportional to the normal density
        x = x[np.exp(mul*(x-mu)**2) < np.random.rand(*x.shape)]

        # make sure we don't add too much
        if remain < len(x):
            x = x[:remain]

        result.append(x)
        remain -= len(x)

    return np.concatenate(result)

This can be used as sns.histplot(invNormal(-4, 4, size=100_000), bins=51), giving me:

[image: histogram]

Note that probability densities have to integrate to 1, so the "wider" you make the interval, the smaller the densities will be (i.e. you can't have a density of 0.4 on the y-axis if the range on the x-axis is [-4, +4]). Also, a KDE is less useful here because it will struggle with the discontinuity at the edges.
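
To make the point about normalisation concrete, here is a small sketch that works out the density implied by the rejection step, assuming the target is proportional to 1 - exp(-(x - mu)**2 / (2 * sd**2)) on [low, high]. The helper names inv_normal_norm_const and inv_normal_pdf are made up for this illustration and are not part of any library.

```python
import math

def inv_normal_norm_const(low, high, mu=0.0, sd=1.0):
    """Integral of 1 - exp(-(x - mu)**2 / (2 * sd**2)) over [low, high]."""
    def phi(z):
        # standard normal CDF via the error function
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    # area under the unnormalised Gaussian bump within the interval
    gauss_area = sd * math.sqrt(2.0 * math.pi) * (
        phi((high - mu) / sd) - phi((low - mu) / sd)
    )
    return (high - low) - gauss_area

def inv_normal_pdf(x, low, high, mu=0.0, sd=1.0):
    """Normalised density of the inverted curve; zero outside [low, high]."""
    if not low <= x <= high:
        return 0.0
    z = inv_normal_norm_const(low, high, mu, sd)
    return (1.0 - math.exp(-0.5 * ((x - mu) / sd) ** 2)) / z

# the density is largest at the edges of the interval
peak = inv_normal_pdf(4.0, -4, 4)
print(peak)
```

For the interval [-4, +4] the peak comes out at roughly 0.18, well short of the 0.4 that the standard normal reaches at its mode.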

Sam Mason
  • If I want to use mean and standard deviation values to generate the data, how can I do that in your example? – BC Smith Nov 19 '20 at 14:08
  • @BCSmith I've updated the code to allow a mean and standard deviation to be specified – Sam Mason Nov 19 '20 at 18:08
  • I want to generalize this method to other distributions like Weibull, beta, binomial, chi-square, etc. How can I do that? – BC Smith Nov 20 '20 at 18:53
  • I'd suggest reading up on some statistics... you just plug the density function into the rejection step – Sam Mason Nov 20 '20 at 20:58
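
As a rough sketch of what "plugging the density function into the rejection step" could look like for a continuous distribution: the function below accepts any vectorised density and an upper bound on it, and keeps uniform candidates with probability 1 - pdf(x) / pdf_max. The names inverse_density_sample, pdf_max and weibull_pdf are made up for this example, and the caller is assumed to supply a pdf_max that really is at least the maximum of the density on [low, high].

```python
import numpy as np

def inverse_density_sample(pdf, pdf_max, low, high, size=1,
                           block_size=1024, rng=None):
    """Sample on [low, high] with density proportional to pdf_max - pdf(x).

    Assumes pdf_max >= pdf(x) everywhere on the interval and that pdf is
    not constant at pdf_max (otherwise nothing is ever accepted).
    """
    rng = np.random.default_rng() if rng is None else rng
    chunks = []
    remain = size
    while remain > 0:
        # uniform candidates within the interval
        x = rng.uniform(low, high, size=block_size)
        # keep a candidate when a uniform height lands above the density
        u = rng.uniform(0.0, pdf_max, size=block_size)
        x = x[u > pdf(x)][:remain]
        chunks.append(x)
        remain -= len(x)
    return np.concatenate(chunks) if chunks else np.empty(0)

def weibull_pdf(x, k=2.0):
    # Weibull density with shape k and scale 1, written out by hand
    return k * x ** (k - 1) * np.exp(-x ** k)

# for k=2 the density peaks at about 0.858, so 0.86 is a safe bound
samples = inverse_density_sample(weibull_pdf, pdf_max=0.86, low=0.0, high=3.0,
                                 size=10_000, rng=np.random.default_rng(0))
```

For discrete distributions such as the binomial, the same idea should work with the probability mass function and integer candidates in place of the uniform draws, though that is not shown here.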