
I have a list of 3D points p stored in an ndarray with shape (N, 3). I want to compute the outer product of each 3D point with itself:

import numpy as np

N = int(1e4)
p = np.random.random((N, 3))
result = np.zeros((N, 3, 3))
for i in range(N):
    result[i, :, :] = np.outer(p[i, :], p[i, :])

Is there a way to compute this outer product without any Python-level loops? The problem is that np.outer does not support anything like an axis argument.

aph

3 Answers


You can use broadcasting:

p[..., None] * p[:, None, :]

This syntax inserts an axis at the end of the first term (making it Nx3x1) and in the middle of the second term (making it Nx1x3). The two are then broadcast together, yielding an Nx3x3 result.
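
To see this concretely, here is a minimal sketch (array sizes arbitrary, not part of the original answer) checking the broadcast product against the explicit loop:

import numpy as np

N = 5
p = np.random.random((N, 3))

# explicit loop, for reference
expected = np.zeros((N, 3, 3))
for i in range(N):
    expected[i] = np.outer(p[i], p[i])

# broadcasting: (N, 3, 1) * (N, 1, 3) -> (N, 3, 3)
result = p[..., None] * p[:, None, :]

assert result.shape == (N, 3, 3)
assert np.allclose(result, expected)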

Paul Panzer
  • Wow, elegant solution. Thanks! – aph Dec 04 '17 at 01:05
  • Can you help me figure out why my solution works? It's even faster, but I don't understand the concept of broadcasting in `einsum`. – Sebastian Mendez Dec 04 '17 at 01:30
  • The dots tell einsum you are only interested in the rightmost axes and want the rest left alone. `i` and `j` refer to the last axes; that they are different tells einsum to create separate axes for them in the output. So in short: nothing or ellipsis -> normal broadcasting; same letter in both arguments -> reduction along this axis; letter occurs in only one term -> separate axis in output – Paul Panzer Dec 04 '17 at 01:51
  • @PaulPanzer Thank you. Is there any way to not use the dots? There should be, because the amount of dimensions is fixed, right? – Sebastian Mendez Dec 04 '17 at 04:46
  • @Sebastian You can use `'ij,ik->ijk'` instead (see the sketch after this thread). What do you mean by the amount of dimensions is fixed? – Paul Panzer Dec 04 '17 at 09:40
  • I meant each array `p` and `p` only have two dimensions, and the ellipsis notation is meant for applying `einsum` to arrays which do not always have the same dimensions. For example, my function works with arrays of one dimension (which is identical to `np.outer`) and 3 dimensions (which I have no comparison to). – Sebastian Mendez Dec 04 '17 at 17:41
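
A minimal sketch tying the thread above together (my own addition, array sizes arbitrary): the ellipsis form agrees with the explicit two-dimensional form, and on a 1-D point it reproduces np.outer:

import numpy as np

p = np.random.random((4, 3))

# ellipsis form: everything left of the named axes is broadcast
a = np.einsum('...i,...j', p, p)

# explicit form: i is shared (matched row by row); j and k become
# separate output axes
b = np.einsum('ij,ik->ijk', p, p)

assert np.allclose(a, b)

# on a 1-D input the ellipsis matches zero leading axes, so the
# result is just the ordinary outer product
v = np.random.random(3)
assert np.allclose(np.einsum('...i,...j', v, v), np.outer(v, v))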

A much better solution than my previous one is to use np.einsum:

np.einsum('...i,...j', p, p)

which is even faster than the broadcasting approach:

In [ ]: %timeit p[..., None] * p[:, None, :]
514 µs ± 4.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [ ]: %timeit np.einsum('...i,...j', p, p)
169 µs ± 1.75 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

As for how it works, I'm not quite sure; I just messed around with einsum until I got the answer I wanted:

In [ ]: np.all(np.einsum('...i,...j', p, p) == p[..., None] * p[:, None, :])
Out[ ]: True
Sebastian Mendez
  • It seems that whenever you ask "How can I do X?" there's always an answer involving `einsum` that is faster than any other approach. I can never understand the documentation well enough, but as far as I'm concerned it's a magic box that solves every array operation and manipulation routine, provided someone else tells you how on SO. – alkasm Dec 04 '17 at 03:01
  • @AlexanderReynolds I have no clue how it works either. All I know is that if you're only adding or multiplying parts of arrays, `einsum` will do anything other `numpy` functions can't. – Sebastian Mendez Dec 04 '17 at 04:47
  • It's worth digging into that `einsum` documentation, especially if you have any prior knowledge of Einstein summation notation, although often Divakar will come along with something even faster using `@` or `np.dot` – Daniel F Dec 04 '17 at 07:51
  • Basic context of Einstein notation is to multiply over indices that are repeated, and sum over indices lost. So for example `'ij,jk->ij'` multiplies the `j` dimensions and then sums over the `k` dimension, while `'ij,jk->ik'` multiplies *and* sums over the `j` dimensions (see the sketch after this thread). – Daniel F Dec 04 '17 at 08:00
  • @DanielF _although often Divakar will come along with something even faster using @ or np.dot_ ... or `tensordot`. – Paul Panzer Dec 04 '17 at 09:50
  • @PaulPanzer Or just, y'know, *broadcasting* – Daniel F Dec 04 '17 at 11:56
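
A minimal sketch of the notation described in the comments above (again my own addition): a repeated index that is dropped from the output is summed over, which for `'ij,jk->ik'` is exactly matrix multiplication:

import numpy as np

A = np.random.random((2, 3))
B = np.random.random((3, 4))

# j is repeated and absent from the output: multiply along j, then
# sum over it -- i.e. ordinary matrix multiplication
assert np.allclose(np.einsum('ij,jk->ik', A, B), A @ B)

# keeping j in the output multiplies without summing
C = np.einsum('ij,jk->ijk', A, B)
assert C.shape == (2, 3, 4)
assert np.allclose(C.sum(axis=1), A @ B)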

You could at least use np.apply_along_axis:

result = np.apply_along_axis(lambda point: np.outer(point, point), 1, p)

Surprisingly, however, this is in fact slower than your method (np.apply_along_axis still runs a Python-level loop internally, and adds per-call overhead):

In [ ]: %%timeit N = int(1e4); p = np.random.random((N, 3))
   ...: result = np.apply_along_axis(lambda point: np.outer(point, point), 1, p)
61.5 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [ ]: %%timeit N = int(1e4); p = np.random.random((N, 3))
   ...: result = np.zeros((N, 3, 3))
   ...: for i in range(N):
   ...:     result[i, :, :] = np.outer(p[i, :], p[i, :])
46 ms ± 709 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Sebastian Mendez