0

So I have setup a 4d numpy array, and I'm trying to preform matrix multiplication row-wise in the 4d array. I'm trying to not use a for loop and np.linalg.multi_dot for the sake of trying to keep the programme as fast as possible. This is what it looks like for my simple test (2x2 of matrices, but my actual problem will have an mxn of matrices).

k1=np.array([[1,2],[3,4]]) 
k2=np.array([[5,6],[7,8]])
k3=np.array([[9,10],[11,12]])
k4=np.array([[13,14],[15,16]])
k5=np.array([[k1,k2],[k3,k4]])

So in the case above, somehow get k1.k2 and k3.k4 within the k5 array (keeping in mind that I only have access to k5). If it helps, each matrix itself will be 2x2. Therefore I would like the output of my code to give me the following:

[[[19 22]
 [43 50]]

[[267 286]
 [323 346]]]

Thank you to anybody willing to waste their time on me and this silly little question!

  • did you try np.multiply(k1,k2) ? – Ran A Jun 28 '22 at 15:10
  • @RanA this is just an example, but i will have at least over 100 in one row, and over 100 in a column, so a shape of (100,100,2,2), the np multiply will become tedious and so will the @ method – FlamingosAreSad Jun 28 '22 at 16:37
  • 1
    Show a working example, even if it is tedious – hpaulj Jun 28 '22 at 17:19
  • 1
    what do you mean by tedious? Requires lots of code, python loops, large broadcasted arrays? `matmul` is designed to work with 4d arrays, treating the first dimensions as 'batch' and sum-of-product on the last 2. `einsum` is also easy to use with larger dimensions. – hpaulj Jun 28 '22 at 20:41

2 Answers2

2
>>> k5 = np.array([[k1, k2], [k3, k4], [k1, k2]])
>>> k5[:, 0] @ k5[:, 1]
array([[[ 19,  22],
        [ 43,  50]],

       [[267, 286],
        [323, 346]],

       [[ 19,  22],
        [ 43,  50]]])
paime
  • 2,901
  • 1
  • 6
  • 17
  • This is nice! But in reality i will have over 100 of these columns, so the @ method will become tedious – FlamingosAreSad Jun 28 '22 at 16:38
  • 1
    How do you define a matmul over != 2 matrices ? Please provide a minimal example that will extend to your case. – paime Jun 28 '22 at 16:48
0

If performance is important not avoid looping (as you mentioned in your question), one of the best methods to do such codes (e.g. np.dot equivalent one), is to use numba accelerator in no python parallel mode without substituting python loops. In this case we can write an equivalent np.matmul as:

@nb.njit(parallel=True)
def matmul(k5):
    matmul_ = np.empty((k5.shape[0], 2, 2), dtype=np.int64)
    for i in nb.prange(k5.shape[0]):
        for j in range(2):
            for k in range(2):
                matmul_[i, j, k] = k5[i, 0, j, 0] * k5[i, 1, 0, k] + k5[i, 0, j, 1] * k5[i, 1, 1, k]
    return matmul_

enter image description here

Ali_Sh
  • 2,667
  • 3
  • 43
  • 66
  • temporary link to the benchmark: https://colab.research.google.com/drive/1tu_YYGgpww_LbBjxBypCt8nI0P8Mu0wj?usp=sharing – Ali_Sh Jun 28 '22 at 18:23