I have a vector of probabilities (which of course sum to 1):

prob = [0.1, 0.3, 0.4, 0.2]

Now I need to generate a random index into this vector (an integer from 0 to 3, both included), where the probability of each index is given by prob:

0 will be generated with prob 0.1
1 will be generated with prob 0.3
2 will be generated with prob 0.4
3 will be generated with prob 0.2

I know that I can do this by computing the cumulative sum:

cumsum = [0.1, 0.4, 0.8, 1.0]

Then generating a random number between 0 and 1:

import numpy as np
rand_num = np.random.random()

And finally using np.digitize to check which bin my random number falls into:

idx = np.digitize([rand_num], cumsum)

This works and I'm happy with it. digitize even accepts a list of numbers and classifies each of them into the bins, so I could write my own function to generate indexes from a given probability distribution.
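For reference, the steps above can be sketched as one small function. This is only an illustration under some assumptions: it uses the standard library instead of NumPy, with bisect.bisect_right standing in for np.digitize (both count the bin edges that are less than or equal to the value), and the name sample_indices and its rng parameter are made up for this example.

```python
import bisect
import itertools
import random


def sample_indices(prob, n, rng=None):
    """Draw n indices, where index i is picked with probability prob[i]."""
    if rng is None:
        rng = random.Random()
    # Running totals of prob, approximately [0.1, 0.4, 0.8, 1.0] for the example.
    cumsum = list(itertools.accumulate(prob))
    last = len(prob) - 1
    # bisect_right counts the edges <= r, like np.digitize with default arguments.
    # min() guards against float rounding leaving cumsum[-1] slightly below 1.
    return [min(bisect.bisect_right(cumsum, rng.random()), last) for _ in range(n)]


prob = [0.1, 0.3, 0.4, 0.2]
indices = sample_indices(prob, 10_000, rng=random.Random(0))
```

Passing a seeded random.Random is optional; it is there only so that a run can be repeated.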

My question is: this is a common problem, so doesn't a function already exist that does this (and that is more efficient than doing it myself)?

Thanks

1 Answer

From Python 3.6 onward you can use random.choices for this. It accepts a weights parameter and returns a list of k selections (k defaults to 1):

>>> from random import choices
>>> prob = [0.1, 0.3, 0.4, 0.2]
>>> choices(range(len(prob)), weights=prob)
[2]
>>> choices(range(len(prob)), weights=prob)
[3]
>>> choices(range(len(prob)), weights=prob, k=4)
[1, 2, 2, 2]
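
A quick way to convince yourself that the weights are respected is to draw many samples and compare the observed frequencies with prob. The sketch below assumes a seeded random.Random instance (the seed 42 and the sample size are arbitrary choices for illustration) so that the run is repeatable; choices is available as a method on such an instance as well as at module level.

```python
import random
from collections import Counter

prob = [0.1, 0.3, 0.4, 0.2]
rng = random.Random(42)  # seeded only so the run is repeatable

# Draw many indices at once with k, then tally how often each one appears.
draws = rng.choices(range(len(prob)), weights=prob, k=100_000)
counts = Counter(draws)

# Observed frequency of each index; these should land close to prob.
freqs = [counts[i] / len(draws) for i in range(len(prob))]
print(freqs)
```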
Chris_Rands