
If I define an array X with shape (2, 2):

X = np.array([[1, 2], [3, 4]])

and take the Kronecker product, then reshape the output using

np.kron(X, X).reshape((2, 2, 2, 2))

I get a resulting matrix:

array([[[[ 1,  2],
         [ 2,  4]],

        [[ 3,  4],
         [ 6,  8]]],


       [[[ 3,  6],
         [ 4,  8]],

        [[ 9, 12],
         [12, 16]]]])

However, when I use np.tensordot(X, X, axes=0) the following matrix is output

array([[[[ 1,  2],
         [ 3,  4]],

        [[ 2,  4],
         [ 6,  8]]],


       [[[ 3,  6],
         [ 9, 12]],

        [[ 4,  8],
         [12, 16]]]])

which is different from the first output. Why is this the case? I found this while searching for answers, but I don't understand why that solution works or how to generalise it to higher dimensions.

Nick Bosch

My first question is, why do you expect them to be the same?

Let's do the kron without reshaping:

In [403]: X = np.array([[1, 2],
     ...:               [3, 4]])
     ...:               
In [404]: np.kron(X,X)
Out[404]: 
array([[ 1,  2,  2,  4],
       [ 3,  4,  6,  8],
       [ 3,  6,  4,  8],
       [ 9, 12, 12, 16]])

It's easy to visualize the action.

[X*1, X*2
 X*3, X*4]
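That block picture can be checked directly with np.block, which assembles exactly that layout of scaled copies of X:

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

# kron tiles scaled copies of X in a block layout;
# np.block reproduces that layout explicitly.
blocks = np.block([[X * 1, X * 2],
                   [X * 3, X * 4]])
assert np.array_equal(blocks, np.kron(X, X))
```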

tensordot is normally thought of as a generalization of np.dot, able to handle more complex situations than the common matrix product (i.e. a sum of products over one or more axes). But here there's no summing at all.

In [405]: np.tensordot(X,X, axes=0)
Out[405]: 
array([[[[ 1,  2],
         [ 3,  4]],

        [[ 2,  4],
         [ 6,  8]]],


       [[[ 3,  6],
         [ 9, 12]],

        [[ 4,  8],
         [12, 16]]]])

When axes is an integer rather than a tuple, the action is a little tricky to understand. The docs say:

``axes = 0`` : tensor product :math:`a\otimes b`

I recently tried to explain what happens when axes is a scalar (it's not trivial) in an answer to How does numpy.tensordot function works step-by-step?

Specifying axes=0 is equivalent to providing this tuple:

np.tensordot(X,X, axes=([],[]))
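A quick sketch of that equivalence: both forms contract over no axes at all, so each produces the full outer product with shape (2, 2, 2, 2).

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

# axes=0 and axes=([], []) both mean "sum over nothing":
# every pairing X[i,j] * X[k,l] is kept as a separate element.
a = np.tensordot(X, X, axes=0)
b = np.tensordot(X, X, axes=([], []))
assert a.shape == (2, 2, 2, 2)
assert np.array_equal(a, b)
```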

In any case it's evident from the output that this tensordot produces the same numbers - but the layout is different from kron's.

I can replicate the kron layout with

In [424]: np.tensordot(X,X,axes=0).transpose(0,2,1,3).reshape(4,4)
Out[424]: 
array([[ 1,  2,  2,  4],
       [ 3,  4,  6,  8],
       [ 3,  6,  4,  8],
       [ 9, 12, 12, 16]])

That is, I swap the middle two axes.

And omitting the reshape, I get the same (2,2,2,2) array that you get from the kron reshape:

np.tensordot(X,X,axes=0).transpose(0,2,1,3)
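Both claims are easy to verify in one go: swapping the middle two axes of the outer product reproduces kron's layout exactly, before and after flattening.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

# Outer product has element [i,j,k,l] = X[i,j] * X[k,l];
# kron's layout is [i,k,j,l], hence the transpose(0, 2, 1, 3).
td = np.tensordot(X, X, axes=0).transpose(0, 2, 1, 3)
assert np.array_equal(td, np.kron(X, X).reshape(2, 2, 2, 2))
assert np.array_equal(td.reshape(4, 4), np.kron(X, X))
```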

I like the explicitness of np.einsum:

np.einsum('ij,kl->ijkl',X,X)    # = tensordot(X,X,0)
np.einsum('ij,kl->ikjl',X,X)    # = kron(X,X).reshape(2,2,2,2)

Or, using broadcasting, the two products are:

X[:,:,None,None]*X[None,None,:,:]   # tensordot 0
X[:,None,:,None]*X[None,:,None,:]   # kron
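As for generalising to higher dimensions: for 2-d arrays A (m, n) and B (p, q), kron interleaves the axes as (m, p, n, q) and then flattens each pair. A sketch (my_kron is a hypothetical helper, not a NumPy function):

```python
import numpy as np

def my_kron(A, B):
    # For 2-d A (m, n) and B (p, q): build the outer product with
    # axes interleaved as (m, p, n, q), then merge each axis pair.
    m, n = A.shape
    p, q = B.shape
    return np.einsum('ij,kl->ikjl', A, B).reshape(m * p, n * q)

A = np.arange(6).reshape(2, 3)
B = np.arange(8).reshape(4, 2)
assert np.array_equal(my_kron(A, B), np.kron(A, B))
```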
hpaulj