I am trying to multiply a set of vectors with their corresponding matrices and would like to sum the resulting vectors at the end. As a numpy example, let's assume we have 20 vectors of size 10x1 and 20 matrices of size 150x10:

import numpy as np
np_b=[ np.random.rand(10) for i in range(20)]
np_A=[ np.random.rand(150,10) for i in range(20)]
#first we multiply each vector with its corresponding matrix
np_allMuls=np.array([np.dot(np_A[i],np_b[i]) for i in range(20)] ) 
#then we sum all of the vectors to get the 150 dimensional sum vector
np_allSum=np.sum( np_allMuls,axis=0 )

So far with tensorflow 0.10 I got:

import tensorflow as tf
tf_b = tf.placeholder("float", [None,10])
tf_A= tf.placeholder("float", [None,150,10])
#the following gives me ValueError: Shape (?, 150, 10) must have rank 2
tf_allMuls=tf.matmul(tf_A,tf_b)

But this symbolic multiplication gives me the error "ValueError: Shape (?, 150, 10) must have rank 2".

Does anyone know why I am getting such an error message? How can I get tf_allMuls correctly?

MajorYel

1 Answer

From the documentation on tf.matmul:

The inputs must be matrices (or tensors of rank > 2, representing batches of matrices), with matching inner dimensions, possibly after transposition.

Since you are using None as the first dimension of your placeholders, the second option is relevant for you, i.e. "batches of matrices". But your tf_b is a batch of vectors (rank 2), not a batch of matrices (rank 3), so the ranks of the two arguments are not the same, which is why you get the error. Instead, give tf_b a trailing dimension of size 1 so that each vector becomes a 10x1 matrix:

tf_b = tf.placeholder("float", [None, 10, 1])
tf_A = tf.placeholder("float", [None, 150, 10])
tf_allMuls = tf.matmul(tf_A, tf_b)

It thus seems that matmul is not able to broadcast (see also this post), and I agree that the error message you get is a bit misleading.
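The shape bookkeeping above can be checked in plain NumPy, whose np.matmul also treats leading dimensions as a batch. This is only a minimal sketch, assuming the 20 vectors and matrices from the question are stacked into single arrays (the names b, A, b_col, all_muls and all_sum are illustrative, not from the original post). It also shows the final reduction over the batch axis that yields the 150-dimensional sum; in TensorFlow the analogous step would presumably be tf.reduce_sum over the first dimension.

```python
import numpy as np

rng = np.random.RandomState(0)
b = rng.rand(20, 10)        # 20 vectors of length 10
A = rng.rand(20, 150, 10)   # 20 matrices of shape 150x10

# Add a trailing axis so each vector becomes a 10x1 matrix: (20, 10, 1)
b_col = b[:, :, np.newaxis]

# Batched product: (20, 150, 10) @ (20, 10, 1) -> (20, 150, 1)
all_muls = np.matmul(A, b_col)

# Drop the trailing axis and sum over the batch axis -> (150,)
all_sum = all_muls[:, :, 0].sum(axis=0)

# Reference: the explicit loop from the question
loop_sum = np.sum([np.dot(A[i], b[i]) for i in range(20)], axis=0)
print(all_sum.shape, np.allclose(all_sum, loop_sum))
```

Feeding arrays shaped like b_col and A into the placeholders above should give the same per-item products as the loop in the question.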

Here is a simple example:

tf_b = tf.placeholder("float", [None, 3, 1])
tf_A = tf.placeholder("float", [None, 3, 3])
tf_allMuls = tf.matmul(tf_A, tf_b)

with tf.Session() as sess:
    b = np.array([1, 2, 3])[np.newaxis, :, np.newaxis]
    A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])[np.newaxis, ...]
    muls = sess.run(tf_allMuls, feed_dict={tf_b: b, tf_A: A})
    print(muls)

which prints

[[[ 14.]
  [ 32.]
  [ 50.]]]
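The printed values can be reproduced without a session. The following is a small NumPy sketch of the same product, assuming np.matmul as a stand-in for tf.matmul on a batch of size one; each output entry is one row of A dotted with b, e.g. 1*1 + 2*2 + 3*3 = 14 for the first row.

```python
import numpy as np

# Same inputs as in the TensorFlow example, with an explicit batch axis
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)[np.newaxis, ...]  # (1, 3, 3)
b = np.array([1, 2, 3], dtype=float)[np.newaxis, :, np.newaxis]                # (1, 3, 1)

# Batched matrix product: (1, 3, 3) @ (1, 3, 1) -> (1, 3, 1)
muls = np.matmul(A, b)
print(muls.shape)
print(muls[0, :, 0])
```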

Also note that the order of the arguments for tf.matmul matters, much like you are used to from actual matrix multiplication. So while this

tf_b = tf.placeholder("float", [None, 1, 3])
tf_A = tf.placeholder("float", [None, 3, 3])
tf_allMuls = tf.matmul(tf_A, tf_b)

does NOT work, the following would (of course, it is not computing the same thing, but it does not raise an error):

tf_b = tf.placeholder("float", [None, 1, 3])
tf_A = tf.placeholder("float", [None, 3, 3])
tf_allMuls = tf.matmul(tf_b, tf_A)
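
The effect of the argument order on shapes can be sketched in NumPy as well, again assuming np.matmul as a stand-in for tf.matmul with a batch of size one (the names A, b_row, ok and failed are illustrative). The inner dimensions must match: a 1x3 row times a 3x3 matrix is fine, while a 3x3 matrix times a 1x3 row is not.

```python
import numpy as np

A = np.ones((1, 3, 3))       # batch of one 3x3 matrix
b_row = np.ones((1, 1, 3))   # batch of one 1x3 row vector

# Row vector times matrix: (1, 1, 3) @ (1, 3, 3) -> (1, 1, 3)
ok = np.matmul(b_row, A)
print(ok.shape)

# Matrix times row vector: inner dimensions 3 and 1 do not match
try:
    np.matmul(A, b_row)
    failed = False
except ValueError:
    failed = True
print(failed)
```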
kafman