16

I want to create a normally distributed array with numpy.random.normal that consists only of positive values. The following example shows that it sometimes returns negative values and sometimes positive ones. How can I modify it so that it returns only positive values?

>>> import numpy
>>> numpy.random.normal(10,8,3)
array([ -4.98781629,  20.12995344,   4.7284051 ])
>>> numpy.random.normal(10,8,3)
array([ 17.71918829,  15.97617052,   1.2328115 ])
>>> 

I guess I could solve it somehow like this:

myList = numpy.random.normal(10,8,3)

# redraw the whole array until all items are positive
while (myList <= 0).any():
    myList = numpy.random.normal(10,8,3)
ustroetz
  • What do you mean by 'only give back positive values'? What do you want it to do if it would return a negative value? – Patashu May 01 '13 at 03:15
  • Well I would like to modify the code so it will only give back positive values. – ustroetz May 01 '13 at 03:16
  • 1
    By definition, a normal distribution extends over all possible values, positive and negative. You cannot reconcile 'normal distribution' with 'only positive values', so my question to you is... what do you REALLY want? – Patashu May 01 '13 at 03:17
  • 1
    I need normal distributed values that I feed into a function. The function does only take positive values. – ustroetz May 01 '13 at 03:21
  • 2
    Normal distributions extend over all possible values, positive and negative. If you prevent it from returning negative values it is by definition no longer a normal distribution. So whatever distribution you feed to your function by definition cannot be negative. With the above in mind, what distribution do you want? – Patashu May 01 '13 at 03:25
  • Okay, thanks for letting me know. Then I don't know, and I have to look more into statistics before I proceed. – ustroetz May 01 '13 at 03:28
  • 1
    The binomial distribution is similar to the normal distribution, but discrete, and ranges only over non-negative integers: http://en.wikipedia.org/wiki/Binomial_distribution – Patashu May 01 '13 at 03:29
  • see also http://stackoverflow.com/questions/18441779/how-to-specify-upper-and-lower-limits-when-using-numpy-random-normal – Sparkler Feb 20 '17 at 19:52

9 Answers

12

The normal distribution, by definition, extends from -inf to +inf so what you are asking for doesn't make sense mathematically.

You can take a normal distribution and take the absolute value to "clip" to positive values, or just discard negative values, but you should understand that it will no longer be a normal distribution.
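The "discard negative values" option can be sketched for a fixed-size array as follows. This is only an illustration: the function name `positive_normal` and its `rng` parameter are hypothetical, and the sketch assumes the mean is not far below zero (otherwise the loop may run for a very long time).

```python
import numpy as np

def positive_normal(mean, sigma, size, rng=None):
    """Return `size` strictly positive draws from Normal(mean, sigma).

    Negative (and zero) draws are discarded and replaced by fresh ones,
    so the result follows a truncated normal, not a true normal.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty(0)
    while out.size < size:
        draws = rng.normal(mean, sigma, size)
        # keep only the positive draws from this batch
        out = np.concatenate([out, draws[draws > 0]])
    return out[:size]

sample = positive_normal(10, 8, 1000, rng=np.random.default_rng(0))
print(sample.min() > 0, sample.shape)
```

Drawing in batches avoids a Python-level loop per element while still returning exactly the requested number of values.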

wim
  • 1
    "The normal distribution, by definition, extends from -inf to +inf ... " that doesn't mean you cannot numerically obtain a clipped/off-centered/x-shifted distribution. This scenario can arise in typical spectral detectors. After every number drawn with np.random.normal(fix_mean,fix_sigma), you can still check the non-negativity condition and draw a new one instead. – jaydeepsb Feb 26 '19 at 10:03
  • 1
    @jaydeepsb ...which is exactly what I mentioned in the answer? – wim Feb 26 '19 at 13:52
  • 2
    The question matches a case in which you want to keep only positive values from a normal distribution but want to define the size of the array in the definition. The question DOES make sense and is valid. – aerijman Mar 31 '19 at 13:52
5

I assume that what you mean is that you want to modify the probability density such that it has the same shape as the normal in the positive range, and is zero in the negative range. That is a pretty common practical case. In such a case, you cannot simply take the absolute value of generated normal random variables. Instead, you have to generate a new independent normally distributed number until you come up with a positive one. One way to do that is recursively, see below.

import numpy as np

def PosNormal(mean, sigma):
    # draw one sample; redraw recursively until it is non-negative
    x = np.random.normal(mean, sigma, 1)
    return x if x >= 0 else PosNormal(mean, sigma)
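To see why taking the absolute value is not equivalent to redrawing negatives, here is a rough comparison, assuming the question's mean of 10 and standard deviation of 8. The names `folded` and `truncated`, the seed, and the threshold of 2 are illustrative choices, not part of the answer itself.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the comparison is repeatable
mean, sigma, n = 10.0, 8.0, 200_000

draws = rng.normal(mean, sigma, n)

# abs() flips the negative tail onto the positive side (a "folded" normal)
folded = np.abs(draws)

# Dropping negatives keeps the normal shape for x >= 0 (a "truncated" normal);
# redrawing until non-negative produces this same distribution
truncated = draws[draws >= 0]

# Fraction of samples close to zero under each approach
near_zero_folded = np.mean(folded < 2.0)
near_zero_truncated = np.mean(truncated < 2.0)

print(f"folded:    {near_zero_folded:.3f} of samples below 2")
print(f"truncated: {near_zero_truncated:.3f} of samples below 2")
```

The folded sample should show more mass near zero, because the flipped negative tail lands there. Note that for a mean of exactly zero the two approaches coincide, so the difference only shows up for off-center distributions.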

smci
Gena Kukartsev
  • This is probably reasonable for some use cases, but note that sampling from this distribution will be biased towards higher values, especially if the mean is near zero. Also you might get a stack overflow if you're particularly unlucky. – vroomfondel Sep 12 '16 at 22:47
  • Yes, this is not a normal distribution anymore; we changed it, and it has a different PDF, truncated at zero. But I think that was the question, and it is often practical to do so. Concerning your worry about stack overflow in this case, no one is THAT unlucky unless you only have a few bytes of RAM. – Gena Kukartsev Sep 28 '16 at 16:32
  • Concerning your worry about stack overflow in this case, you are right if one is interested in a several-sigma right tail of the distribution only, i.e. if the mean of the distribution is negative by a few sigmas. In that case, the performance of this solution will deteriorate and will eventually overflow. But I doubt that this is the intended use case. It is more of a quick hack. – Gena Kukartsev Sep 28 '16 at 16:39
  • Seems really inefficient since you could also use `abs()` – user1406177 Feb 13 '17 at 09:12
  • No, using abs() would change the shape of the distribution. The difference is between cutting off negative tail and "flipping" it and adding to the rest of the distribution. This would disproportionally inflate probability density near zero. You can argue that it satisfies the original question: it will produce positive-only numbers but from a different and rather weird distribution. – Gena Kukartsev Feb 20 '17 at 18:35
  • 1
    You don't need a recursive function to achieve the same result: `while x<=0: x = np.random.normal(mean, sigma)` – Corvince Feb 15 '19 at 12:35
1

What about using lognormal along these lines:

    mu = np.mean(np.log(list))
    sigma = np.std(np.log(list))

    new_list = np.random.lognormal(mu, sigma, length_of_new_list)
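If there is no existing data to fit, one alternative is to pick the lognormal parameters so that the samples have a chosen mean and standard deviation. The sketch below uses the standard moment-matching formulas for the lognormal distribution; the helper name `lognormal_from_moments` and the target values 10 and 8 are assumptions made for illustration, and this is a different technique from fitting to logged data as above.

```python
import numpy as np

def lognormal_from_moments(target_mean, target_std, size, rng=None):
    """Draw lognormal samples whose mean and std match the targets.

    For a lognormal with parameters (mu, sigma):
        mean = exp(mu + sigma**2 / 2)
        var  = (exp(sigma**2) - 1) * exp(2*mu + sigma**2)
    Solving for mu and sigma gives the expressions below.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma2 = np.log(1.0 + (target_std / target_mean) ** 2)
    mu = np.log(target_mean) - sigma2 / 2.0
    return rng.lognormal(mu, np.sqrt(sigma2), size)

values = lognormal_from_moments(10.0, 8.0, 200_000, rng=np.random.default_rng(0))
print(values.min() > 0, values.mean())
```

Every sample is strictly positive by construction, but the distribution is right-skewed rather than bell-shaped, so it only resembles a normal when the standard deviation is small relative to the mean.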
leoneckert
1

data = np.random.randint(low=1,high=100,size=(4,4),dtype='int')

Mahesh Sonavane
  • This was exactly what I was looking for. Thank you! – Tensigh Aug 29 '18 at 05:09
  • 2
    Your solution is wrong if the question is about obtaining non-negative normally distributed numbers. Since np.random.randint draws random numbers from a uniform distribution, you still need to use np.random.normal. As suggested by ustroetz, the workaround is to keep drawing new numbers (with the same mean and sigma) until one is non-negative, and then include it in your array. – jaydeepsb Feb 26 '19 at 09:56
1

Or maybe you could just 'shift' your entire distribution to the 'right' by subtracting the min (or adding the abs val of your min):

y = np.random.normal(0.0, 1.0, 10)

y
array([-0.16934484,  0.06163384, -0.29714508, -0.25917105, -0.0395456 ,
        0.17424635, -0.42289079,  0.71837785,  0.93113373,  1.12096384])

y - min(y)
array([0.25354595, 0.48452463, 0.12574571, 0.16371974, 0.38334519,
       0.59713714, 0.        , 1.14126864, 1.35402452, 1.54385463])
nfreundl
1

The question is reasonable. For motivation, consider simulations of biological cells. The distribution of the count of a type of molecule in a cell can be approximated by the normal distribution, but must be non-negative to be physically meaningful.

My whole-simulator uses this method to sample the initial distribution of a molecule's count:

def non_neg_normal_sample(random_state, mean, std, max_iters=1000):
    """ Obtain a non-negative sample from a normal distribution

    The distribution returned is normal for 0 <= x, and 0 for x < 0

    Args:
        random_state (:obj:`numpy.random.RandomState`): a random state
        mean (:obj:`float`): mean of the normal dist. to sample
        std (:obj:`float`): std of the normal dist. to sample
        max_iters (:obj:`int`, optional): maximum number of draws of the true normal distribution

    Returns:
        :obj:`float`: a normal sample that is not negative

    Raises:
        :obj:`ValueError`: if taking `max_iters` normal samples does not obtain one that is not negative
    """
    iter = 0
    while True:
        sample = random_state.normal(mean, std)
        iter += 1
        if 0 <= sample:
            return sample
        if max_iters <= iter:
            raise ValueError(f"{iter} draws of a normal dist. with mean {mean:.2E} and std {std:.2E} "
                             f"fails to obtain a non-negative sample")

I expand on @gena-kukartsev 's answer in two ways: First, I avoid recursion which could overflow the call stack. (Let's avoid answers that can overflow the stack on stackoverflow!) Second, I catch possibly bad input by limiting the number of samples of the distribution.
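How many draws the loop needs depends on the probability that a single draw is non-negative. A minimal sketch of that calculation, using the normal CDF written in terms of `math.erf`, follows; the helper name `prob_non_negative` is hypothetical and not part of the simulator described above.

```python
import math

def prob_non_negative(mean, std):
    """P[X >= 0] for X ~ Normal(mean, std), computed from the normal CDF."""
    return 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))

p = prob_non_negative(10.0, 8.0)
# The number of draws until the first success is geometric with mean 1/p
expected_draws = 1.0 / p
print(f"P[sample >= 0] = {p:.4f}, expected draws = {expected_draws:.2f}")
```

A check like this could be used to reject parameter combinations for which `max_iters` draws are very unlikely to succeed, before any sampling is attempted.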

Arthur
  • My answer will be inefficient, of course, if P[0 <= sample] is very low. The normal distribution could be used to reject calls to `non_neg_normal_sample` which are highly likely to fail. But the structure of `non_neg_normal_sample` will work for any distribution that includes negative and positive values. – Arthur Dec 11 '20 at 19:38
0

You can offset your entire array by its lowest (leftmost) value. What you get may not be a true normal distribution, but within the scope of your work, dealing with a finite array, you can ensure that the values are non-negative and fit under a bell curve.

>>> import numpy as np
>>> mu,sigma = (0,1.0)
>>> s = np.random.normal(mu, sigma, 10)
>>> s
array([-0.58017653,  0.50991809, -1.13431539, -2.34436721, -1.20175652,
        0.56225648,  0.66032708, -0.98493441,  2.72538462, -1.28928887])
>>> np.min(s)
-2.3443672118476226
>>> abs(np.min(s))
2.3443672118476226
>>> np.add(s,abs(np.min(s)))
array([ 1.76419069,  2.85428531,  1.21005182,  0.        ,  1.14261069,
        2.90662369,  3.00469429,  1.3594328 ,  5.06975183,  1.05507835])
Benny
0

You could use high loc with low scale:

np.random.normal(100, 10, 10) /100

[0.96568643 0.92123722 0.83242272 0.82323367 1.07532713 0.90125736
 0.91226052 0.90631754 1.08473303 0.94115643]
sntrcode
0
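import numpy as np
arr=np.random.normal(0,1,10)
# The original post indexed with an undefined name, gdp_cap; arr is presumably what was meant
arr[arr<0]=-arr[arr<0] #Just invert the elements less than 0
print(arr)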
  • where did you get gdp_cap? – soggypants Aug 29 '23 at 18:11