Broadcasting np.dot vs tf.matmul for tensor-matrix multiplication (Shape must be rank 2 but is rank 3 error)

Question

Let's say I have the following tensors:

X = np.zeros((3,201, 340))
Y = np.zeros((340, 28))

Taking the dot product of X and Y succeeds with numpy and yields a tensor of shape (3, 201, 28). However, with tensorflow I get the following error: Shape must be rank 2 but is rank 3 ...

Minimal code example:

import numpy as np
import tensorflow as tf

X = np.zeros((3, 201, 340))
Y = np.zeros((340, 28))
print(np.dot(X, Y).shape)  # succeeds: (3, 201, 28)
tf.matmul(X, Y)  # fails: Shape must be rank 2 but is rank 3

Any idea how to achieve the same result with tensorflow?

Divakar · Accepted Answer · 2017-12-26T08:40:08.467

3

Since you are working with tensors, it would be better (for performance) to use tensordot rather than np.dot. NumPy lets numpy.dot work on higher-dimensional arrays at the cost of lower performance, and it seems tensorflow simply doesn't allow this with tf.matmul.

So, for NumPy, we would use np.tensordot -

np.tensordot(X, Y, axes=((2,),(0,)))

For tensorflow, it would be with tf.tensordot -

tf.tensordot(X, Y, axes=((2,),(0,)))

Related post to understand tensordot.
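As a quick sanity check on the axes argument, here is a minimal NumPy-only sketch. It assumes random inputs (instead of the zeros in the question) so that the comparison is meaningful; the variable names are illustrative. It contracts axis 2 of X with axis 0 of Y, which is the same axes pair passed to tf.tensordot above.

```python
import numpy as np

# Random data instead of zeros so the equality check is not trivial.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 201, 340))
Y = rng.standard_normal((340, 28))

# Contract axis 2 of X (length 340) with axis 0 of Y (length 340).
via_tensordot = np.tensordot(X, Y, axes=((2,), (0,)))
via_dot = np.dot(X, Y)

print(via_tensordot.shape)                  # (3, 201, 28)
print(np.allclose(via_tensordot, via_dot))  # True
```

The remaining axes of X come first in the result, followed by the remaining axes of Y, which is why the shape matches what np.dot produces here.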

edited Dec 26 '17 at 08:40

answered Dec 25 '17 at 14:12


Although your answer probably works (I haven't tested), to create a tuple of a single element, you should do `(element,)`. `(element)` is just the same as `element`, i.e. it doesn't create a tuple. – Hameer Abbasi Dec 25 '17 at 16:05
@HameerAbbasi `tensordot` expects a tuple of two tuples as the axes param. That's what we are feeding here. – Divakar Dec 25 '17 at 16:33
That's exactly the point. `((2),(0))` is equivalent to `(2, 0)`, because the inner parentheses are used for grouping rather than creating a tuple. To create a tuple of two tuples do `((2,),(0,))`. – Hameer Abbasi Dec 25 '17 at 16:36
@HameerAbbasi Ah, I guess it works just as well with a tuple of scalars for this special one-axis reduction case. – Divakar Dec 25 '17 at 16:41
OP here: Actually, to make it work I've had to use `((2,),(0,))` – Yuval Atzmon Dec 26 '17 at 08:39
@yuval Ah okay. Edited post. Thanks for the feedback. – Divakar Dec 26 '17 at 08:40
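The tuple point from the comments can be checked directly in plain Python. The sketch below is only an illustration with NumPy; it shows that `(2)` is an int while `(2,)` is a tuple, and that np.tensordot accepts both spellings for a single-axis contraction. It makes no claim about which forms tf.tensordot accepts beyond what the OP reported above.

```python
import numpy as np

# Parentheses alone only group; the trailing comma makes the tuple.
print(type((2)))             # <class 'int'>
print(type((2,)))            # <class 'tuple'>
print(((2), (0)) == (2, 0))  # True

X = np.zeros((3, 201, 340))
Y = np.zeros((340, 28))

# For a single contracted axis, NumPy accepts either form.
with_tuples = np.tensordot(X, Y, axes=((2,), (0,)))
with_scalars = np.tensordot(X, Y, axes=(2, 0))
print(with_tuples.shape, with_scalars.shape)  # (3, 201, 28) (3, 201, 28)
```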

score 1 · Answer 2 · answered Dec 25 '17 at 14:07

tf.matmul doesn't allow multiplying tensors of different ranks the way np.dot does.

To work around this, you can reshape the tensor. This essentially turns a rank-3 tensor into a rank-2 matrix by stacking its matrices one on top of the other.

You can use this:

tf.reshape(tf.matmul(tf.reshape(Aijk, [i*j, k]), Bkl), [i, j, l])

where i, j and k are the dimensions of the first tensor (Aijk) and k and l are the dimensions of the second matrix (Bkl).

Taken from here.
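Spelled out step by step, the reshape approach might look like the sketch below. It assumes TensorFlow 2.x with eager execution (needed for the `.numpy()` calls), and the names A, B, flat and result are illustrative, not from the original answers. It also compares the result against tf.tensordot from the accepted answer.

```python
import numpy as np
import tensorflow as tf

# Shapes from the question: A is (i, j, k), B is (k, l).
i, j, k, l = 3, 201, 340, 28
rng = np.random.default_rng(0)
A = tf.constant(rng.standard_normal((i, j, k)))
B = tf.constant(rng.standard_normal((k, l)))

# Stack the i matrices of shape (j, k) into one (i*j, k) matrix.
flat = tf.reshape(A, [i * j, k])
# Ordinary rank-2 matrix product: (i*j, k) x (k, l) -> (i*j, l).
prod = tf.matmul(flat, B)
# Unstack back to rank 3: (i, j, l).
result = tf.reshape(prod, [i, j, l])

# Same contraction via tensordot, for comparison.
reference = tf.tensordot(A, B, axes=((2,), (0,)))

print(result.shape)  # (3, 201, 28)
print(np.allclose(result.numpy(), reference.numpy()))  # True
```

The reshape is safe here because only the leading axes are merged; the contracted axis k stays last in A, so each row of the flattened matrix is still one length-k vector.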
