Here's a generalization of my answer in Fast random weighted selection across all rows of a stochastic matrix:
def vectorized_choice(p, n, items=None):
s = p.cumsum(axis=1)
r = np.random.rand(p.shape[0], n, 1)
q = np.expand_dims(s, 1) >= r
k = q.argmax(axis=-1)
if items is not None:
k = np.asarray(items)[k]
return k
p
is expected to be a two-dimensional array whose rows are probability vectors. n
is the number of samples to draw from the distribution defined by each row. If items
is None, the samples are integers in range(0, p.shape[1])
. If items
is not None, it is expected to be a sequence with length p.shape[1]
.
Example:
In [258]: p = np.array([[0.1, 0.5, 0.4], [0.75, 0, 0.25], [0, 0, 1], [1/3, 1/3, 1/3]])
In [259]: p
Out[259]:
array([[0.1 , 0.5 , 0.4 ],
[0.75 , 0. , 0.25 ],
[0. , 0. , 1. ],
[0.33333333, 0.33333333, 0.33333333]])
In [260]: vectorized_choice(p, 20)
Out[260]:
array([[1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1, 1, 1, 2, 2, 0, 1, 2, 2, 2],
[0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[1, 0, 2, 2, 0, 1, 2, 1, 0, 0, 0, 0, 2, 2, 0, 0, 2, 1, 1, 2]])
In [261]: vectorized_choice(p, 20, items=[1, 2, 3])
Out[261]:
array([[2, 1, 2, 2, 2, 3, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, 2, 3, 2, 2],
[1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 1, 1, 3, 3, 1, 3, 1, 1, 1],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
[3, 3, 3, 1, 3, 2, 1, 2, 3, 1, 2, 2, 3, 2, 1, 2, 1, 2, 2, 2]])
Timing for p
with shape (1000000, 3)
:
In [317]: p = np.random.rand(1000000, 3)
In [318]: p /= p.sum(axis=1, keepdims=True)
In [319]: %timeit vectorized_choice(p, 20, items=np.arange(1, p.shape[1]+1))
1.89 s ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Here's the timing for Divakar's function:
In [320]: %timeit random_choice_prob_vectorized(p, 20, choice=np.arange(1, p.shape[1]+1))
7.33 s ± 43.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The difference will be less pronounced if you increase the number of columns in p
, and if you make the number of columns big enough, Divakar's function will be faster. E.g.
In [321]: p = np.random.rand(1000, 120)
In [322]: p /= p.sum(axis=1, keepdims=True)
In [323]: %timeit vectorized_choice(p, 20, items=np.arange(1, p.shape[1]+1))
6.41 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [324]: %timeit random_choice_prob_vectorized(p, 20, choice=np.arange(1, p.shape[1]+1))
6.29 ms ± 342 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)