randomly choose pair (i, j) with probability P[i, j] given stochastic matrix P

Question

I have numpy two dimension array P such that P[i, j] >= 0 and all P[i, j] sums to one. How to choose pair of indexes (i, j) with probability P[i, j] ?

EDIT: I am interested in numpy build function. Is there something for this problem? May be for one dimensional array?

The easy way - pick a random number q [0,1]. Pick a random point in your matrix (list), if q < P[i,j] keep. If not, try again. The hard(er) way, build a cumlatative distribution function for your discrete prob. set. — Hooked, Oct 10 '12 at 00:58
For a general algorithm (not necessarily numpy), see [this excellent article](http://www.keithschwarz.com/darts-dice-coins/), based on [this question](http://stackoverflow.com/questions/5027757/data-structure-for-loaded-dice). It lists a number of algorithms, including one that can pick a random element in O(1), with O(nlogn) initial setup. This question can probably be closed as a duplicate, unless numpy/scipy has a builtin method that would help. — interjay, Oct 10 '12 at 01:20

score 1 · Answer 1 · answered Oct 10 '12 at 01:20

Here's a simple algorithm in python that does what you are expecting.

Let's take for example a single dimension array P equal to [0.1,0.3,0.4,0.2]. The logic can be extended to any number of dimensions.

Now we set each element to the sum of all the elements that precede it: P => [0, 0.1, 0.4, 0.8, 1]

Using a random generator, we generate numbers that are between 0 and 1. Let's say x = 0.2. Using a simple binary search, we can determine that x is between the first element and the second element. We just pick the first element for this value of x.

If you look closely, the chance that 0 =< X < 0.1 is 0.1. The chance that 0.1 =< x < 0.4 is 0.3 and so on.

For the 2D array, it is better to convert it to a 1D array, even though, you should be able to implement a 2D array binary search algorithm.

nneonneo · Answer 2 · 2012-10-11T12:09:38.950

1

# setup
import bisect
import numpy as np
cs = P.cumsum()

# get random value
r = np.random.uniform()
i, j = divmod(cs.searchsorted(r), P.shape[1])

O(n) initial setup, O(log n) selection.

edited Oct 11 '12 at 12:09

answered Oct 10 '12 at 06:19

nneonneo

171,345
36
312
383

score 0 · Answer 3 · answered Apr 12 '17 at 11:01

Sample setup:

P = np.random.random((20, 10)) # Initializing P matrix
P = P/P.sum() # Normalizing probability matrix

P.ravel() flattens the probability matrix P.

ij = np.random.choice(range(P.size), p=P.ravel())

ij tells which element from P was randomly selected using weights from probability matrix P

i, j = np.unravel_index(ij, P.shape)

i, j are the coordinates of the selected element in matrix P

randomly choose pair (i, j) with probability P[i, j] given stochastic matrix P

3 Answers3