
I am trying to get to grips with einsum notation. This question and its answers have helped me a lot.

But now I can't grasp how einsum works when calculating an outer product:

import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
np.einsum('i,j->ij', x, y)

array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])

That answer gives the following rule:

By repeating the label i in both input arrays, we are telling einsum that these two axes should be multiplied together.

I can't understand how this multiplication happens when there is no repeated axis label in np.einsum('i,j->ij', x, y).

Could you please walk through the steps that np.einsum takes in this example?

Or, more broadly: how does einsum work when no matching axis labels are given?

Kenenbek Arzymatov

2 Answers


In the output of np.einsum('i,j->ij', x, y), element [i,j] is simply the product of element i in x and element j in y. In other words, np.einsum('i,j->ij', x, y)[i,j] = x[i]*y[j].
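As a quick check of that rule, the sketch below compares the einsum call with two other standard NumPy ways of writing the same outer product, np.outer and broadcasting. The variable names via_einsum, via_outer and via_broadcast are only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Three ways to build the array whose element [i, j] is x[i] * y[j].
via_einsum = np.einsum('i,j->ij', x, y)
via_outer = np.outer(x, y)
via_broadcast = x[:, None] * y[None, :]  # shapes (3, 1) and (1, 3) broadcast to (3, 3)

print(np.array_equal(via_einsum, via_outer))      # True
print(np.array_equal(via_einsum, via_broadcast))  # True
```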

Compare it to np.einsum('i,i->i', x, y), where element i of the output is x[i]*y[i]:

np.einsum('i,i->i', x, y)

[ 4 10 18]

And if a label in the input is missing from the output, the output is summed along the axis of the missing label. Here is a simple example:

np.einsum('i,j->i', x, y)

[15 30 45]

Here the label j in the input is missing from the output, which is equivalent to summing along axis=1 (the axis corresponding to label j):

np.sum(np.einsum('i,j->ij', x, y), axis=1)

[15 30 45]
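Putting the two rules together (multiply the elements picked by the labels, sum over labels missing from the output), the whole thing can be written as nested loops. The function naive_einsum below is a hypothetical helper written for illustration, not how NumPy implements einsum; it assumes explicit "->" notation with no ellipsis or broadcasting, and it is far slower than the real function.

```python
import itertools
import numpy as np

def naive_einsum(subscripts, *operands):
    """Illustrative loops for explicit-mode einsum (hypothetical helper)."""
    inputs, output = subscripts.replace(" ", "").split("->")
    input_labels = inputs.split(",")

    # Each label gets its size from the operand axis it is attached to.
    sizes = {}
    for labels, op in zip(input_labels, operands):
        for label, n in zip(labels, op.shape):
            sizes[label] = n
    all_labels = sorted(sizes)

    result = np.zeros([sizes[l] for l in output], dtype=np.result_type(*operands))
    # Visit every combination of label values, repeated or not.
    for values in itertools.product(*(range(sizes[l]) for l in all_labels)):
        idx = dict(zip(all_labels, values))
        term = 1
        for labels, op in zip(input_labels, operands):
            term = term * op[tuple(idx[l] for l in labels)]
        # Labels absent from the output are summed over by this accumulation.
        result[tuple(idx[l] for l in output)] += term
    return result

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(naive_einsum('i,j->ij', x, y))  # same values as np.einsum('i,j->ij', x, y)
print(naive_einsum('i,j->i', x, y))   # [15 30 45]
```

With 'i,j->ij' no label is repeated or dropped, so every (i, j) pair contributes exactly one product to its own output cell, which is the outer product.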
Crazy Coder

In general, you can understand einsum by first working out the dimensions (shapes) of the inputs and of the output that the einsum notation describes.

To make the explanation easier, let's say x.shape = (3,) and y.shape = (4,):

x = np.array([1, 2, 3])
y = np.array([4, 5, 6, 7])
np.einsum('i,j->ij', x, y)
array([[ 4,  5,  6,  7],
       [ 8, 10, 12, 14],
       [12, 15, 18, 21]])

Dimensionality

For the outer product np.einsum('i,j->ij', x, y), the subscript for the 1st input is the single character i. You can think of the number of characters as the number of dimensions of that input, so the first input x has only 1 dimension. The same goes for the 2nd input: its subscript is the single character j, so y also has only 1 dimension. Lastly, the output ij has 2 characters, so it has 2 dimensions, and its shape must be (3, 4): label i runs over the first input, which has 3 elements, and label j runs over the second input, which has 4 elements.
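The character-counting rule can be confirmed directly from ndim and shape. This is a minimal check using the same x and y as above; the name result is only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])     # labelled i, 3 elements
y = np.array([4, 5, 6, 7])  # labelled j, 4 elements

result = np.einsum('i,j->ij', x, y)

# One character per input subscript, two characters in the output subscript.
print(x.ndim, y.ndim, result.ndim)  # 1 1 2
print(result.shape)                 # (3, 4)
```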

Each Element in the result array

Next, focus on the output subscript ij. We now know the result is a 2D array, a 3 by 4 matrix; ij describes how ONE element, at row i and column j, is calculated. Each element is a product of input elements: the element at location [i,j] is the product of element i of the first input (x) and element j of the second input (y).

So the element at location [0,0] is calculated by taking location 0 of the 1st input, x[0] = 1, and location 0 of the 2nd input, y[0] = 4, giving 1 * 4 = 4 for that ONE element.

Likewise, the element at location [2,3] of the result is x[2] * y[3] = 3 * 7 = 21.

In short, read ij in i,j->ij as "i times j" for ONE element of a 2-dimensional result (2 dimensions because of the 2 characters). The elements actually taken from the inputs labelled i and j are picked by the location index [i,j].
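The element-by-element reading can be spelled out as two explicit loops. This is only a sketch of the arithmetic, using an illustrative array called manual, and it says nothing about how einsum is implemented internally.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6, 7])

# Fill each output location [i, j] with x[i] * y[j], one element at a time.
manual = np.empty((len(x), len(y)), dtype=x.dtype)
for i in range(len(x)):
    for j in range(len(y)):
        manual[i, j] = x[i] * y[j]

print(manual[0, 0])  # 4  (x[0] * y[0] = 1 * 4)
print(manual[2, 3])  # 21 (x[2] * y[3] = 3 * 7)
print(np.array_equal(manual, np.einsum('i,j->ij', x, y)))  # True
```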

You can find the transpose of the outer product in one line

The transpose of the outer product is simply i,j->ji. Again there are two characters in the output, so it is a 2D array. The size of the 1st dimension must be the size of j, because j comes first, and j labels the 2nd input, which has 4 elements. By the same logic the 2nd dimension has the size of i, which is 3, so the resulting array has shape (4, 3).

Then the ONE element at location [3,2] of the resulting 2D array is described by ji, meaning element 3 of the input labelled j, y[3] = 7, times element 2 of the input labelled i, x[2] = 3. The result is 7 * 3 = 21.

Hence, the result is

np.einsum('i,j->ji', x, y)
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18],
       [ 7, 14, 21]])
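To confirm that swapping the output labels really is a transpose, the result can be compared with the .T attribute of the ordinary outer product. The names outer and outer_t are only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6, 7])

outer = np.einsum('i,j->ij', x, y)    # shape (3, 4)
outer_t = np.einsum('i,j->ji', x, y)  # output labels swapped

print(outer_t.shape)                     # (4, 3)
print(outer_t[3, 2])                     # 21 (y[3] * x[2] = 7 * 3)
print(np.array_equal(outer_t, outer.T))  # True
```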
palazzo train