Matrix multiplication while subsetting elements from matrices and storing in a new matrix

Question

I am attempting a numpy.matmul call using as variables

Matrix A of dimensions (p, t, q)
Matrix B of dimensions (r, t).
A categories vector of shape r and p categories, used to take slices of B and define the index of A do use.

The multiplications are done iteratively using the indices of each category. For each category p_i, I extract from A a submatrix (t, q). Then, I multiply those with a subset of B (x, t), where x is a mask defined by r == p_i. Finally, the matrix multiplication of (x, t) and (t, q) produces the output (x, q) which is stored at S[x].

I have noted that I cannot figure out a non-iterative version of this algorithm. The first snippet describes an iterative solution. The second one is an attempt at what I would wish to get, where everything is calculated as a single-step and would be presumably faster. However, it is incorrect because matrix A has three dimensions instead of two. Maybe there is no way to do this in NumPy with a single call, and in general, looking for advice/ideas to try out.

Thanks!

import numpy as np
p, q, r, t = 2, 9, 512, 4
# data initialization (random)
np.random.seed(500)
S = np.random.rand(r, q)
A = np.random.randint(0, 3, size=(p, t, q))
B = np.random.rand(r, t)
categories = np.random.randint(0, p, r)

print('iterative') # iterative
for i in range(p):
    # print(i)
    a = A[i, :, :]
    mask = categories == i
    b = B[mask]
    print(b.shape, a.shape, S[mask].shape,
          np.matmul(b, a).shape)
    S[mask] = np.matmul(b, a)

print(S.shape)

a simple way to write it down

S = np.random.rand(r, q)
print(A[:p,:,:].shape)
result = np.matmul(B, A[:p,:,:])

# iterative assignment
i = 0
S[categories == i] = result[i, categories == i, :]
i = 1
S[categories == i] = result[i, categories == i, :]

The next snippet will produce an error during the multiplication step.

# attempt to multiply once, indexing all categories only once (not possible)
np.random.seed(500)
S = np.random.rand(r, q)
# attempt to use the categories vector
a = A[categories, :, :]
b = B[categories]
# due to the shapes of the arrays, this multiplication is not possible
print('\nsingle step (error due to shapes of the matrix a')
print(b.shape, a.shape, S[categories].shape)
S[categories] = np.matmul(b, a)
print(scores.shape)

iterative
(250, 4) (4, 9) (250, 9) (250, 9)
(262, 4) (4, 9) (262, 9) (262, 9)
(512, 9)


single step (error due to shapes of the 2nd matrix a).
(512, 4) (512, 4, 9) (512, 9)

what is the final output you are looking for? `np.sum(np.matmul(b, a)))`? lets say you did this without any explicit iteration, what would be the shape of this array — Akshay Sehgal, Dec 17 '22 at 23:27
@AkshaySehgal the final output I expect has shape `(512, 9)`. By step, individual submatrices `(250, 4)` and `(262, 4)` extracted from B are multiplied each by two different matrices `(4, 9)` extracted from A, and the results `(250, 9)` and `(262, 9)` are then stored by index into `S`. — ilibarra, Dec 17 '22 at 23:34

hpaulj · Accepted Answer · 2022-12-18T00:34:55.823

1

In [63]: (np.ones((512,4))@np.ones((512,4,9))).shape
Out[63]: (512, 512, 9)

This because the first array is broadcasted to (1,512,4). I think you want instead to do:

In [64]: (np.ones((512,1,4))@np.ones((512,4,9))).shape
Out[64]: (512, 1, 9)

Then remove the middle dimension to get a (512,9).

Another way:

In [72]: np.einsum('ij,ijk->ik', np.ones((512,4)), np.ones((512,4,9))).shape
Out[72]: (512, 9)

edited Dec 18 '22 at 00:34

answered Dec 18 '22 at 00:00

hpaulj

221,503
14
230
353

Thank you very much! This does the work and it is faster than the OP. – ilibarra Dec 18 '22 at 10:27

score 0 · Answer 2 · answered Dec 18 '22 at 01:14

0

To remove the loop altogether, you can try this

bigmask = np.arange(p)[:, np.newaxis] == categories
C = np.matmul(B, A)
res = C[np.broadcast_to(bigmask[..., np.newaxis], C.shape)].reshape(r, q)

# `res` has the same rows as the iterative `S` but in the wrong order
# so we need to reorder the rows
sort_index = np.argsort(np.broadcast_to(np.arange(r), bigmask.shape)[bigmask])
assert np.allclose(S, res[sort_index])

Though I'm not sure it's much faster than the iterative version.

answered Dec 18 '22 at 01:14

kmkurn

611
1
13

Thanks for your input @kmkurn. Unfortunately, this is slower than the OP. – ilibarra Dec 18 '22 at 10:23
@ilibarra No worries. I had missed your snippet. – kmkurn Dec 18 '22 at 22:14

Matrix multiplication while subsetting elements from matrices and storing in a new matrix

2 Answers2