Is there a way to do matrix-by-tensor multiply?

Question

Is there a way to multiply a 2D matrix by a tensor with a batch dimension so that it behaves like tf.matmul between 2D matrices when the batch dimension equals one?

Specifically, I want to multiply a 2D matrix of shape (6, 255) by a tensor of shape (2, 255, 255, 1) (batch dimension equal to 2), where:

import tensorflow as tf
import numpy as np

im = np.random.rand(255, 255)
A = np.random.rand(6, 255)
B = np.array([im,im]).reshape([-1,255, 255,1])

batch_size = 2
a = tf.placeholder(tf.float64,shape=(6, 255))
b = tf.placeholder(tf.float64,shape=(batch_size,255, 255,1))
out_mat = tf.matmul(a, b)  # Didn't work
with tf.Session() as sess:
    sess.run(out_mat, feed_dict={a: A, b: B})

and the result should have shape (2, 6, 255, 1) (thank you, @rvinas).
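
As a reference for the intended result, here is a minimal NumPy sketch (not TensorFlow) that multiplies A into every batch element and channel of B, one 2D slice at a time. The helper name `matrix_by_tensor_reference` is made up for illustration; it only pins down what the output should contain and is not meant to be efficient.

```python
import numpy as np


def matrix_by_tensor_reference(A, B):
    """Multiply A (m, n) into every (n, p) slice of B (batch, n, p, channels).

    Hypothetical helper: a slow but explicit definition of the desired result.
    """
    batch, n, p, channels = B.shape
    out = np.empty((batch, A.shape[0], p, channels), dtype=np.result_type(A, B))
    for i in range(batch):
        for c in range(channels):
            # Plain 2D matrix product for one batch element and one channel.
            out[i, :, :, c] = np.matmul(A, B[i, :, :, c])
    return out


im = np.random.rand(255, 255)
A = np.random.rand(6, 255)
B = np.array([im, im]).reshape([-1, 255, 255, 1])

C = matrix_by_tensor_reference(A, B)
print(C.shape)  # (2, 6, 255, 1)
```

Each `C[i, :, :, 0]` equals `np.matmul(A, im)`, which is the batch-size-one behaviour described above.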

Note: In TensorFlow, matmul can only handle 2D matrices, and batch_matmul can only multiply (..., m, n) by (..., n, p), where the leading dimensions ... are the same in both A and B.

How would you get an output shape of `(2, 6, 6, 1)`? Did you mean `(2, 6, 255, 1)`? Could you please provide an example using NumPy? — rvinas, Oct 22 '18 at 09:06
This might help: https://stackoverflow.com/a/51326380/1735003 — P-Gn, Oct 22 '18 at 09:31
@rvinas, you are right, I meant (2, 6, 255, 1). Consider a NumPy example without the batch dimension: `im = np.random.rand(255, 255)`, `A = np.random.rand(6, 255)`, `B = im`, `C = np.matmul(A, B)`. The shape of C will be (6, 255). I'm trying to reproduce the same result for tensors when B has shape (2, 255, 255, 1). — P. Max, Oct 22 '18 at 17:35

rvinas · Accepted Answer · 2018-10-24T06:15:39.777

Here's one way to do it using implicit broadcasting and tf.reduce_sum:

import tensorflow as tf
import numpy as np

batch_size = 2
dim_1 = 3
dim_2 = 4
dim_3 = 5
dim_4 = 32

im = np.arange(dim_2 * dim_3).reshape(dim_2, dim_3)
A = np.arange(dim_1 * dim_2).reshape(dim_1, dim_2)
B = im
C = np.matmul(A, B)
print('NumPy result (for batch_size=1):\n {}'.format(C))

B = np.repeat(B[None, ..., None], batch_size, axis=0)
B = np.repeat(B, dim_4, axis=3)
print(B.shape)  # B shape=(batch_size, dim_2, dim_3, dim_4)

a = tf.placeholder(tf.float64, shape=(dim_1, dim_2))
b = tf.placeholder(tf.float64, shape=(batch_size, dim_2, dim_3, dim_4))
a_ = a[None, :, :, None, None]  # Shape=(1, dim_1, dim_2, 1, 1)
b_ = b[:, None, :, :, :]  # Shape=(batch_size, 1, dim_2, dim_3, dim_4)
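# The product broadcasts to (batch_size, dim_1, dim_2, dim_3, dim_4); summing over axis 2 contracts dim_2.
out_mat = tf.reduce_sum(a_ * b_, axis=2)  # Shape=(batch_size, dim_1, dim_3, dim_4)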

with tf.Session() as sess:
    c = sess.run(out_mat, feed_dict={a: A, b: B})
    print('TF result (for batch_size={}):\n {}'.format(batch_size, c))
    assert c.shape == (batch_size, dim_1, dim_3, dim_4)
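
To check the broadcasting idea without a TensorFlow session, the same indexing can be mirrored in NumPy. This is only a sketch under the assumption that NumPy and TF broadcasting agree for these shapes; the random values and the slice chosen for the check are arbitrary.

```python
import numpy as np

batch_size, dim_1, dim_2, dim_3, dim_4 = 2, 3, 4, 5, 32

A = np.arange(dim_1 * dim_2, dtype=np.float64).reshape(dim_1, dim_2)
B = np.random.rand(batch_size, dim_2, dim_3, dim_4)

A_ = A[None, :, :, None, None]  # Shape=(1, dim_1, dim_2, 1, 1)
B_ = B[:, None, :, :, :]        # Shape=(batch_size, 1, dim_2, dim_3, dim_4)

# Broadcasted product has shape (batch_size, dim_1, dim_2, dim_3, dim_4);
# summing over axis 2 contracts dim_2, exactly like a matrix product.
out = np.sum(A_ * B_, axis=2)

# Compare one (batch, channel) slice against a plain 2D matrix product.
expected = np.matmul(A, B[1, :, :, 7])
print(out.shape)                               # (2, 3, 5, 32)
print(np.allclose(out[1, :, :, 7], expected))  # True
```

Note that the intermediate product holds dim_2 times as many elements as the result, which may matter for memory when the shapes are large.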

And an alternative way using tf.matmul, tf.reshape and tf.transpose:

b_ = tf.transpose(b, [1, 0, 2, 3])  # Shape=(dim_2, batch_size, dim_3, dim_4)
b_ = tf.reshape(b_, [dim_2, -1])  # Shape=(dim_2, batch_size * dim_3 * dim_4)
matmul = a @ b_  # Shape=(dim_1, batch_size * dim_3 * dim_4)
matmul_ = tf.reshape(matmul, [dim_1, batch_size, dim_3, dim_4])  # Shape=(dim_1, batch_size, dim_3, dim_4)
out_mat = tf.transpose(matmul_, [1, 0, 2, 3])  # Shape=(batch_size, dim_1, dim_3, dim_4)
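
The transpose and reshape steps are easy to get subtly wrong, so here is a NumPy sketch of the same sequence, compared against the broadcasting formulation. It assumes NumPy's `transpose` and `reshape` order elements the same way as their TF counterparts (both use row-major order by default).

```python
import numpy as np

batch_size, dim_1, dim_2, dim_3, dim_4 = 2, 3, 4, 5, 32

A = np.random.rand(dim_1, dim_2)
B = np.random.rand(batch_size, dim_2, dim_3, dim_4)

# Move the contracted axis (dim_2) to the front, then flatten everything else.
B_ = np.transpose(B, [1, 0, 2, 3])  # Shape=(dim_2, batch_size, dim_3, dim_4)
B_ = B_.reshape(dim_2, -1)          # Shape=(dim_2, batch_size * dim_3 * dim_4)

flat = A @ B_                       # Shape=(dim_1, batch_size * dim_3 * dim_4)

# Undo the flattening, then move the batch axis back to the front.
out = flat.reshape(dim_1, batch_size, dim_3, dim_4).transpose(1, 0, 2, 3)

# Reference: broadcast and sum over dim_2.
reference = np.sum(A[None, :, :, None, None] * B[:, None], axis=2)
print(out.shape)                    # (2, 3, 5, 32)
print(np.allclose(out, reference))  # True
```

This version does a single 2D matrix product and never materialises the five-dimensional intermediate.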

For your particular example, you would set batch_size=2, dim_1=6, dim_2=255, dim_3=255 and dim_4=1.
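
With those sizes, the same contraction can also be written as an einsum. The sketch below uses `np.einsum` so that it runs without TensorFlow; `tf.einsum` takes an equation string of the same form, but whether it is preferable to the two approaches above has not been checked here.

```python
import numpy as np

A = np.random.rand(6, 255)          # (dim_1, dim_2)
B = np.random.rand(2, 255, 255, 1)  # (batch_size, dim_2, dim_3, dim_4)

# i: dim_1, j: dim_2 (summed over), b: batch, k: dim_3, l: dim_4
out = np.einsum('ij,bjkl->bikl', A, B)

print(out.shape)  # (2, 6, 255, 1)
print(np.allclose(out[0, :, :, 0], A @ B[0, :, :, 0]))  # True
```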

Thank you @rvinas, I got it. But suppose b_ has a depth of 32, i.e. `Shape=(batch_size, 1, dim_2, dim_3, 32)`: what is the right shape for a_, `Shape=(?, dim_1, dim_2, ?, ?)`, to perform the multiplication? I can't see how to get from the matrix-by-tensor idea to tensor-by-tensor multiplication. Could you give the same demonstration with a depth dimension (a.k.a. filters)? — P. Max, Oct 23 '18 at 22:16
You're welcome! In this case, the shape of `a_` would also be `(1, dim_1, dim_2, 1, 1)`. This allows for implicit broadcasting in the last axis. Broadcasting in TF is similar to [broadcasting in NumPy](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html). I updated the examples to account for a general `dim_4`. — rvinas, Oct 24 '18 at 06:20
Oh, right... I got the idea of broadcasting in NumPy and in TF, simple and easy. Appreciated. — P. Max, Oct 24 '18 at 21:23
