How to compute the scalar product of matrices as fast as possible

Question

I want to compute (as fast as possible)

Av = v[0] * A[0, :, :] + ... + v[M-1] * A[M-1, :, :]

where v and A are np.ndarrays with shapes (M,) and (M, N, N), respectively. Here's a minimal example of what I have so far:

import numpy as np

N = 1000
M = 100

A = np.random.randint(-10, 10, size=(M, N, N))
v = np.random.randint(-10, 10, size=(M,))

%timeit np.sum(A * v[:, None, None], axis=0)

This works as expected and gives

591 ms ± 7.42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

However, I was wondering whether there's a faster way to calculate it?

Almost surely [`einsum`](https://numpy.org/doc/stable/reference/generated/numpy.einsum.html?highlight=einsum#numpy.einsum). I guess `np.einsum('nij,n->ij', A, v)`, but can't write a proper answer right now -- sorry. — phipsgabler, Jul 22 '21 at 09:32

joni · Answer 1 · 2021-08-12T09:57:04.447

You can use np.einsum, see this question and the awesome answers for more details. Timing both on my machine yields:

In [57]: %timeit np.sum(A * v[:, None, None], axis=0)
441 ms ± 7.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [58]: %timeit np.einsum('ijk,i', A, v)
74.6 ms ± 725 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Note that np.einsum('ijk,i', A, v) is the same as np.einsum('ijk,i -> jk', A, v).
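As a quick sanity check, here is a minimal sketch comparing the variants on a small random input. The small sizes, the seed, and the np.tensordot line are my own additions for illustration and are not part of the timings above. In implicit mode, einsum sums over the index that appears twice (i) and returns the remaining indices in alphabetical order (jk), which is why the two spellings agree.

```python
import numpy as np

# Small sizes chosen only so the check runs instantly (assumption, not the
# N = 1000, M = 100 from the question).
rng = np.random.default_rng(0)
M, N = 5, 4
A = rng.integers(-10, 10, size=(M, N, N))
v = rng.integers(-10, 10, size=M)

# Literal definition: v[0] * A[0] + ... + v[M-1] * A[M-1]
reference = sum(v[m] * A[m] for m in range(M))

# Broadcasting version from the question.
via_sum = np.sum(A * v[:, None, None], axis=0)

# Implicit and explicit einsum forms from this answer.
implicit = np.einsum('ijk,i', A, v)
explicit = np.einsum('ijk,i->jk', A, v)

# Another way to write the same contraction (not benchmarked here):
# contract the last axis of v with the first axis of A.
via_tensordot = np.tensordot(v, A, axes=1)

for result in (via_sum, implicit, explicit, via_tensordot):
    assert np.array_equal(result, reference)
```

Whether np.tensordot is faster or slower than einsum will depend on the dtype and the machine, so it is worth timing both with %timeit on your own data.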
