10

I am trying to vectorize the following code:

for i in xrange(s.shape[0]):
            a[i] = np.argmax(np.random.multinomial(1,s[i,:]))

s.shape = 400 x 100 [given].

a.shape = 400 [expected].

s is a 2D matrix, which contains the probabilities of pairs. The multinomial is expected to draw a random sample from each row of the s matrix and store the result in vector a.

Divakar
  • 218,885
  • 19
  • 262
  • 358
Don Jacob
  • 101
  • 4
  • Are all the rows of `s` guaranteed to have a sum of `1`? – Divakar Apr 30 '16 at 08:34
  • 2
    Here's an attempt: http://stackoverflow.com/questions/34187130/fast-random-weighted-selection-across-all-rows-of-a-stochastic-matrix/34190035 – ayhan Apr 30 '16 at 09:12

2 Answers2

3

In the comments, it is said that there is an attempt at vectorizing this here, however, it's not only an attempt. It is a complete solution the this question too.

The goal of the question is to obtain the index of the postion containing the 1 of the multinomial event. That is, the following realization [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0] will yield 14. Thus, it is actually equivalent to executing:

np.random.choice(np.arange(len(p)),p=p) # here, p is s[i,:]

Therefore, Warren Weckesser solution to Fast random weighted selection across all rows of a stochastic matrix is also a solution to this question. The only difference is whether the vectors of probability are defined in rows or in columns, which can be solved easily either transposing s to be used as prob_matrix or defining a custom version of vectorized that works for s strtucture:

def vectorized(prob_matrix, items):
    s = prob_matrix.cumsum(axis=1)
    r = np.random.rand(prob_matrix.shape[0])
    k = (s < r).sum(axis=1)
    return items[k]

In this question, with dimensions 400x400, the speedup is around a factor 10:

%%timeit
a = np.empty(400)
for i in range(s.shape[0]):
    a[i] = np.argmax(np.random.multinomial(1,s[i,:]))
# 5.96 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%%timeit 
vals = np.arange(400,dtype=int)
vectorized(s,vals)
# 544 µs ± 5.49 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
OriolAbril
  • 7,315
  • 4
  • 29
  • 40
  • 1
    There's a small typo with `np.random_choice` which should be `np.random.choice`. I wasn't able to edit as it is just a 1 character difference. – universvm Apr 23 '22 at 15:48
-2

How about

[np.argmax(np.random.multinomial(1,s[i,:])) for i in xrange(s.shape[0])]
Antimony
  • 2,230
  • 3
  • 28
  • 38