1

I have two gaussian distributions (I'm using multivariate_normal) and I'd like to draw from them with probability of p for the first gaussian and 1-p for the other one. I'd like to make n draws.

Is it possible to do that without a for loop? (for efficiency purposes)

Thanks

IsaacLevon
  • 2,260
  • 4
  • 41
  • 83

1 Answers1

1

Yes, it is possible to perform this operation without a loop. Try:

import numpy as np
from scipy import stats

sample_size = 100
p = 0.25

# Flip a coin with P(HEADS) = p to determine which distribution to draw from
indicators = stats.bernoulli.rvs(p, size=sample_size)

# Draw from N(0, 1) w/ probability p and N(-1, 1) w/ probability (1-p)
draws = (indicators == 1) * np.random.normal(0, 1, size=sample_size) + \
    (indicators == 0) * np.random.normal(-1, 1, size=sample_size)

You can accomplish the same thing using np.vectorize (caveat emptor):

def draw(x):
  if x == 0:
     return np.random.normal(-1, 1)
  elif x == 1:
     return np.random.normal(0, 1)

draw_vec = np.vectorize(draw)    
draws = draw_vec(indicators)

If you need to extend the solution to a mixture of more than 2 distributions, you can use np.random.multinomial to assign samples to distributions and add additional cases to the if/else in draw.

Bill DeRose
  • 2,330
  • 3
  • 25
  • 36
  • but there's a slight problem, is it also possible to label each sample accordingly? – IsaacLevon Dec 26 '18 at 22:51
  • 1
    The labels corresponding to which distribution a sample is drawn from are the entries in the `indicators` array. More explicitly put, you might try: `draws_with_labels = zip(draws, indicators)`. – Bill DeRose Dec 26 '18 at 23:03
  • 1
    Did you notice that `draws` contains just two distinct values? I don't think that is the desired behavior. – Warren Weckesser Dec 26 '18 at 23:34
  • @WarrenWeckesser, I've fixed the issue you spotted and included another example which accomplishes the desired task. – Bill DeRose Dec 27 '18 at 00:18