
I can wrap my head around the two-dimensional examples, and I have an intuition that 'repeated' dimensions cause multiplication and that dimensions omitted from the explicit output cause summation across that dimension. But I'm trying to parse an algorithm implementation that essentially just consists of einsums on three-dimensional arrays, and I need a manual to unpack the formulas. E.g. if I have an array x of shape [F][I][D] and an array y of shape [F][I], then einsum('fid,fi->fd', x, y) produces an array of shape [F][D], but I can't figure out the execution order and orientation of the multiplications and summations.
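For instance, the two-dimensional case is clear to me: the repeated j in 'ij,jk->ik' is multiplied along and then summed away, which is just ordinary matrix multiplication (a quick illustrative check):

import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)
# j appears in both inputs (multiply element-wise along it) and is
# missing from the output (sum over it), so this is simply A @ B
np.einsum('ij,jk->ik', A, B)   # identical to A @ B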

x = (np.arange(5*3*2)).reshape(5,3,2)
x

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]],

       [[24, 25],
        [26, 27],
        [28, 29]]])
y = (np.arange(5*3)).reshape(5,3)
y

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])
np.einsum("fid,fi->fd",x,y)

array([[  10,   13],
       [ 100,  112],
       [ 298,  319],
       [ 604,  634],
       [1018, 1057]])

Is there a manual for how to unpack an einsum string, so that I get to "ordinary people" summation and multiplication formulas?

0__

2 Answers


As far as I can deduce, the result is

out[f][d] = ∑_i x[f][i][d] * y[f][i]

So, looking again at 'fid,fi->fd': since i does not appear on the right-hand side of the ->, it is summed over, and the indices that appear in both inputs (f and i here) are matched up for the element-wise multiplication.
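A sketch with explicit loops (using the example arrays from the question) that spells this out in "ordinary people" terms:

import numpy as np

x = np.arange(5*3*2).reshape(5, 3, 2)
y = np.arange(5*3).reshape(5, 3)

F, I, D = x.shape
out = np.zeros((F, D), dtype=x.dtype)
for f in range(F):
    for d in range(D):
        for i in range(I):          # i is missing from 'fd', so it is summed over
            out[f, d] += x[f, i, d] * y[f, i]

print(np.array_equal(out, np.einsum('fid,fi->fd', x, y)))   # True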

0__

In [15]: x = (np.arange(5*3*2)).reshape(5,3,2)                                  
In [16]: y = (np.arange(5*3)).reshape(5,3)                                      
In [17]: np.einsum('fid,fi->fd',x,y)                                            
Out[17]: 
array([[  10,   13],
       [ 100,  112],
       [ 298,  319],
       [ 604,  634],
       [1018, 1057]])

Some alternatives, using broadcasting and sum:

In [18]: (x*y[:,:,None]).sum(axis=1)                                            
Out[18]: 
array([[  10,   13],
       [ 100,  112],
       [ 298,  319],
       [ 604,  634],
       [1018, 1057]])

and batched dot:

In [19]: np.array([np.dot(b,a) for a,b in zip(x,y)])                            
Out[19]: 
array([[  10,   13],
       [ 100,  112],
       [ 298,  319],
       [ 604,  634],
       [1018, 1057]])

The summed i dimension is the 'sum-of-products' dimension of matrix multiplication. f is the batch dimension, occurring in the same position in every term.
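To see that, take one batch slice: for each f, the operation is just a matrix-vector product over i (a quick check with the same x and y as above):

y[0] @ x[0]     # (3,) @ (3,2) -> array([10, 13]), the first row of the einsum result
np.array([y[f] @ x[f] for f in range(x.shape[0])])   # same as np.einsum('fid,fi->fd', x, y)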

matmul/@ also does batched matrix multiplication, but its application isn't as intuitive:

In [21]: y[:,None,:]@x                                                          
Out[21]: 
array([[[  10,   13]],

       [[ 100,  112]],

       [[ 298,  319]],

       [[ 604,  634]],

       [[1018, 1057]]])

This is a (5,1,2) result that has to be squeezed to get rid of that middle dimension.
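For example, either of these recovers the (5,2) result (reusing the arrays above):

(y[:, None, :] @ x).squeeze(axis=1)      # (5, 1, 2) -> (5, 2)
np.squeeze(y[:, None, :] @ x, axis=1)    # same thing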

hpaulj