
A matrix/vector multiplication looks like this and works fine:

import numpy as np

A = np.array([[1, 0],
              [0, 1]])       # matrix
u = np.array([[3], [9]])     # column vector
U = np.dot(A, u)             # U = A*u
U

Is there a way to keep the same structure if the elements Xx, Xy, Yx, Yy, u, v are each 2- or 3-dimensional arrays?

A = np.array([[Xx, Xy],
              [Yx, Yy]])
u = np.array([[u], [v]])
U = np.dot(A, u)

The desired result is the following, which already works fine (so the answer to my question is rather 'nice to have'):

U = Xx*u + Xy*v
V = Yx*u + Yy*v

Success Story

Thanks to @loopy walt and to @Ivan!

1. It works: np.einsum('ijmn,jkmn->ikmn', A, w) and (w.T@A.T).T both reproduce the correct contravariant coordinate transformation.

2. Minor issue: np.einsum('ijmn,jkmn->ikmn', A, w) and (w.T@A.T).T both return an array with one pair of square brackets too many (it should be 2-dimensional). See @Ivan's answer for why (a small sketch of how to drop that extra axis follows after the timings below):

array([[[   0,   50,  200],
        [ 450,  800, 1250],
        [1800, 2450, 3200]]])

3. To optimize: building the block matrix is time consuming:

A = np.array([[Xx, Xy],
              [Yx, Yy]])
w = np.array([[u], [v]])

So the overall runtime for np.einsum('ijmn,jkmn->ikmn', A, w) and (w.T@A.T).T is larger. Perhaps this can be optimized:

- p_einstein: np.einsum('ijmn,jkmn->ikmn', A, w)            9.07 µs ± 267 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
- p_tensor:   (w.T@A.T).T                                   7.89 µs ± 225 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
- p_vect:     [Xx*u1 + Xy*v1, Yx*u1 + Yy*v1]                2.63 µs ± 173 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
- p_vect, including A = np.array([[Xx, Xy],[Yx,Yy]]):       7.66 µs ± 290 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
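For completeness, here is a minimal sketch of how the extra pair of brackets from point 2 can be dropped (the 3x3 block values below are made up for illustration, since my actual arrays are not shown):

import numpy as np

# made-up 3x3 blocks, purely for illustration
Xx, Xy, Yx, Yy, u1, v1 = np.arange(54).reshape(6, 3, 3)

A = np.array([[Xx, Xy],
              [Yx, Yy]])                   # shape (2, 2, 3, 3)
w = np.array([[u1], [v1]])                 # shape (2, 1, 3, 3)

res = np.einsum('ijmn,jkmn->ikmn', A, w)   # shape (2, 1, 3, 3)
U, V = res[:, 0]                           # drop the singleton axis -> two (3, 3) arrays

print(np.allclose([U, V], [Xx*u1 + Xy*v1, Yx*u1 + Yy*v1]))   # True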
pyano
  • Don't stop there. Create A etc. We can help improve the code, but not start it for you. There are too many choices that we can't make for you. – hpaulj Aug 17 '21 at 22:03

2 Answers


This is essentially a block matrix operation, which you can generalize from a simple matrix operation. Your example is good because you defined your block matrices as such (I renamed them A_p and u_p so they don't conflict with matrix u):

A_p = np.array([[Xx, Xy],
                [Yx, Yy]])
u_p = np.array([[u], [v]])

If you look at a standard matrix operation, in terms of shapes you have (r, s) meets (s, t), which ultimately results in a matrix of shape (r, t); the s dimension gets reduced in the process. If you're not yet familiar with Einstein summation and np.einsum, you should take a quick look. Given matrices M and N, the matrix operation would look like this:

out = np.zeros((r, t))
for i in range(r):
    for j in range(s):
        for k in range(t):
            out[i, k] += M[i, j]*N[j, k]

This becomes very intuitive when you come to realize how the operation takes place: covering the entire multi-dimensional index space, we accumulate products on the output matrix.
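For a quick sanity check (a sketch using A and u from the question, so (r, s, t) = (2, 2, 1)), this loop reproduces np.dot:

M, N = A, u                    # from the question: shapes (2, 2) and (2, 1)
r, s = M.shape
t = N.shape[1]

out = np.zeros((r, t), dtype=M.dtype)
for i in range(r):
    for j in range(s):
        for k in range(t):
            out[i, k] += M[i, j]*N[j, k]

print(np.array_equal(out, np.dot(M, N)))   # True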

With np.einsum, this would look very similar indeed (here M=A and N=u):

>>> np.einsum('ij,jk->ik', A, u)
array([[3],
       [9]])

It is useful to think in terms of coordinates i, j, and k rather than dimension sizes r, s, and t, just like when using np.einsum above, so I'll keep the former notation for the rest of the answer.


Now looking at the block matrices: the input matrices contain blocks of matrices themselves. Matrix M has shape (i, j, (m, n)) (i.e. (i, j, m, n)), where (m, n) refers to the shape of the blocks. Similarly, for N we have (j, k, m, n). We've basically replaced the scalar of shape () with an (m, n) block (here 2x2), that's all.
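Concretely, with the 2x2 blocks used in the example below, the shapes are:

>>> A_p.shape   # (i, j, m, n)
(2, 2, 2, 2)
>>> u_p.shape   # (j, k, m, n)
(2, 1, 2, 2)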

Therefore you can directly write the A_p*u_p operation as:

>>> np.einsum('ijmn,jkmn->ikmn', A_p, u_p)
array([[[[  0,  50],
         [200, 450]]],


       [[[  0, 110],
         [440, 990]]]])

Which you can compare with the 'hand-made' result:

>>> np.stack((Xx*u + Xy*v, Yx*u + Yy*v))[:, None]
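A quick equality check (just a sketch) confirms the two agree:

>>> np.allclose(np.einsum('ijmn,jkmn->ikmn', A_p, u_p),
...             np.stack((Xx*u + Xy*v, Yx*u + Yy*v))[:, None])
True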

Notice this can be used for any (m, n) block and any matrix size.

You can now try implementing A_p*u_p with a loop just like A*u.
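For instance, a minimal sketch of that loop (not optimized, just to show the accumulation pattern, reusing A_p and u_p from above) could look like this:

r, s, bm, bn = A_p.shape       # A_p is an (r, s) grid of (bm, bn) blocks
_, t, _, _ = u_p.shape         # u_p is an (s, t) grid of (bm, bn) blocks

out = np.zeros((r, t, bm, bn))
for i in range(r):
    for j in range(s):
        for k in range(t):
            # same accumulation as before, but each product is now an
            # elementwise product of two (bm, bn) blocks
            out[i, k] += A_p[i, j]*u_p[j, k]

assert np.allclose(out, np.einsum('ijmn,jkmn->ikmn', A_p, u_p))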


Edit: expanding on @loopy walt's answer, which shows you can achieve the same result with:

>>> (u_p.T @ A_p.T).T

Indeed __matmul__ can handle multi-dimensional arrays, and for more than two dimensions it does so in the following manner: both inputs are treated as stacks of matrices residing in the last two indexes. The input shapes are (i, j, m, n) and (j, k, m, n). In order to use __matmul__, we need to pull the block dimensions, namely (m, n), into the first positions, and leave the matrix dimensions, (i, j) and (j, k) respectively, last. The following steps need to be taken:

  • Transposing will have the effect of inverting the order of dimensions: (n, m, j, i) and (n, m, k, j) respectively.

  • We then swap the two operands, so we compute something of the form (*, k, j) x (*, j, i) = (*, k, i), where * = (n, m).

  • Transposing the output leads to a shape of (i, k, m, n).
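As a quick sanity check of the shapes involved (again using A_p and u_p from above, with 2x2 blocks):

>>> A_p.T.shape               # (n, m, j, i)
(2, 2, 2, 2)
>>> u_p.T.shape               # (n, m, k, j)
(2, 2, 1, 2)
>>> (u_p.T @ A_p.T).shape     # (n, m, k, i)
(2, 2, 1, 2)
>>> (u_p.T @ A_p.T).T.shape   # (i, k, m, n)
(2, 1, 2, 2)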

Ivan
  • I'll give it a try – pyano Aug 18 '21 at 19:06
  • @pyano Seems like you've already accepted an answer! I'm a bit disappointed to find you basically took nothing away from the above. If you had paid just a tiny bit of attention to what's being said, you would understand how to not *return an array with a pair of square brackets too much* as you put it. I find your interest lacking, and that you would much rather get a fast answer quickly and not worry about the details. – Ivan Aug 18 '21 at 22:04
  • I am grateful to have received 2 answers. 2 possible solutions and both are valuable. For accepting an answer there is only 1 possibility here, but I have given an upvote to each. For open community learning I have evaluated the 3 different approaches, indicating your merit. Sorry for disappointing you - this was not my intention. – pyano Aug 19 '21 at 08:14

Borrowing A_p and u_p from @Ivan, I'd like to advertise the following much simpler approach that makes use of __matmul__'s inbuilt broadcasting. All we need to do is reverse the order of dimensions:

(u_p.T@A_p.T).T

This gives the same result as Ivan's einsum but is a bit more flexible, giving us full broadcasting without having to figure out the corresponding format string every time.

np.allclose(np.einsum('ijmn,jkmn->ikmn', A_p, u_p),(u_p.T@A_p.T).T)
# True
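For instance (a sketch with made-up 3-D blocks, just to illustrate the broadcasting point): if the blocks gain an extra dimension, the expression above keeps working unchanged, while the einsum subscripts would have to be rewritten (e.g. 'ijmno,jkmno->ikmno'):

import numpy as np

# made-up blocks of shape (4, 3, 2), purely for illustration
Xx, Xy, Yx, Yy, u, v = np.random.rand(6, 4, 3, 2)

A_p = np.array([[Xx, Xy], [Yx, Yy]])   # shape (2, 2, 4, 3, 2)
u_p = np.array([[u], [v]])             # shape (2, 1, 4, 3, 2)

res = (u_p.T @ A_p.T).T                # shape (2, 1, 4, 3, 2), no format string needed
print(np.allclose(res[:, 0], [Xx*u + Xy*v, Yx*u + Yy*v]))   # True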
loopy walt