
I am trying to define a custom op in TensorFlow in which, at one point, I need to construct a matrix (z) that contains the sums of all combinations of pairs of rows of two matrices (x and y). In general, the numbers of rows of x and y are dynamic.

In numpy it's fairly simple:

import numpy as np
from itertools import product

rows_x = 4
rows_y = 2
dim = 2

x = np.arange(dim*rows_x).reshape(rows_x, dim)
y = np.arange(dim*rows_y).reshape(rows_y, dim)

print('x:\n{},\ny:\n{}\n'.format(x, y))

z = np.zeros((rows_x*rows_y, dim))
print('for loop:')
for i, (x_id, y_id) in enumerate(product(range(rows_x), range(rows_y))):
    print('row {}: {} + {}'.format(i, x[x_id, ], y[y_id, ]))
    z[i, ] = x[x_id, ] + y[y_id, ]

print('\nz:\n{}'.format(z))

returns:

x:
[[0 1]
 [2 3]
 [4 5]
 [6 7]],
y:
[[0 1]
 [2 3]]

for loop:
row 0: [0 1] + [0 1]
row 1: [0 1] + [2 3]
row 2: [2 3] + [0 1]
row 3: [2 3] + [2 3]
row 4: [4 5] + [0 1]
row 5: [4 5] + [2 3]
row 6: [6 7] + [0 1]
row 7: [6 7] + [2 3]

z:
[[  0.   2.]
 [  2.   4.]
 [  2.   4.]
 [  4.   6.]
 [  4.   6.]
 [  6.   8.]
 [  6.   8.]
 [  8.  10.]]
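
(For reference, the same z can also be built without the explicit loop via numpy broadcasting; a minimal sketch reusing x, y, rows_x, rows_y and dim from above:)

# Insert singleton axes so x becomes (rows_x, 1, dim) and y becomes (1, rows_y, dim);
# their sum broadcasts to (rows_x, rows_y, dim) and is then flattened row-wise.
z_vec = (x[:, None, :] + y[None, :, :]).reshape(rows_x * rows_y, dim)
assert np.array_equal(z_vec, z)  # same values and row order as the loop above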

However, I haven't got a clue how to implement anything similar in TensorFlow.

I have mainly been going through SO and the TensorFlow API in the hope of finding a function that would yield combinations of elements of two tensors, or one that would give permutations of elements of a tensor, but to no avail.

Any suggestions are most welcome.

ponadto

2 Answers


You could simply use the broadcasting ability of TensorFlow.

import tensorflow as tf

x = tf.constant([[0, 1], [2, 3], [4, 5], [6, 7]], dtype=tf.float32)
y = tf.constant([[0, 1], [2, 3]], dtype=tf.float32)

x_ = tf.expand_dims(x, 1)  # shape [4, 1, 2]
y_ = tf.expand_dims(y, 0)  # shape [1, 2, 2]
z = tf.reshape(tf.add(x_, y_), [-1, 2])  # broadcasts to [4, 2, 2], then flattens to [8, 2]
# or more succinctly
z = tf.reshape(x[:, None] + y[None], [-1, 2])

sess = tf.Session()
sess.run(z)
P-Gn
    It's magic... So, to get this right: you're first expanding `x` and `y`, so that `x_` has shape [4, 1, 2], and `y_` has shape [1, 2, 2]. And then, the broadcasting ability of `tf.add` "figures out" how to fill out the dimensions into [4, 2, 2] (which is the shape of `tf.add(x_, y_)`), and finally, `tf.reshape` ensures that we have 2 columns in `z`. The "figuring out" is the key part, and as I'm reading [here](https://www.tensorflow.org/performance/xla/broadcasting): ... – ponadto Apr 21 '17 at 14:32
    ... "When two compatible arrays are encountered, the result shape has the maximum among the two inputs at every dimension index.", and then: "A special case arises, and is also supported, where each of the input arrays has a degenerate dimension at a different index. In this case, the result is an "outer operation". That's subtle. Thank you for this answer! – ponadto Apr 21 '17 at 14:32

Option 1

Defining z as a variable and updating its rows:

import tensorflow as tf
from itertools import product


x = tf.constant([[0, 1], [2, 3], [4, 5], [6, 7]], dtype=tf.float32)
y = tf.constant([[0, 1], [2, 3]], dtype=tf.float32)

rows_x, dim = x.get_shape().as_list()
rows_y = y.get_shape().as_list()[0]

z = tf.Variable(initial_value=tf.zeros([rows_x * rows_y, dim]), dtype=tf.float32)
for i, (x_id, y_id) in enumerate(product(range(rows_x), range(rows_y))):
    # overwrite row i of z with x[x_id] + y[y_id]
    z = tf.scatter_update(z, i, x[x_id] + y[y_id])

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    z_val=sess.run(z)
    print(z_val)

This prints

[[  0.   2.]
 [  2.   4.]
 [  2.   4.]
 [  4.   6.]
 [  4.   6.]
 [  6.   8.]
 [  6.   8.]
 [  8.  10.]]

Option 2

Creating z through a list comprehension:

import tensorflow as tf
from itertools import product


x = tf.constant([[0, 1], [2, 3], [4, 5], [6, 7]], dtype=tf.float32)
y = tf.constant([[0, 1], [2, 3]], dtype=tf.float32)

rows_x, dim = x.get_shape().as_list()
rows_y = y.get_shape().as_list()[0]

# build the rows as a Python list of tensors; tf.reshape stacks them into a matrix
z = [x[x_id] + y[y_id] for x_id in range(rows_x) for y_id in range(rows_y)]
z = tf.reshape(z, (rows_x * rows_y, dim))

with tf.Session() as sess:
    z_val=sess.run(z)
    print(z_val)

Comparison: the second solution is roughly 1.5 times faster (measuring only the construction of z in both solutions). Specifically, the timings were: first solution: 0.211 seconds; second solution: 0.137 seconds.
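
For reference, a minimal sketch of how such a construction-only measurement could be taken (timing the graph-building code with time.time() is an assumption here, not necessarily how the numbers above were produced):

import time
import tensorflow as tf

x = tf.constant([[0, 1], [2, 3], [4, 5], [6, 7]], dtype=tf.float32)
y = tf.constant([[0, 1], [2, 3]], dtype=tf.float32)
rows_x, dim = x.get_shape().as_list()
rows_y = y.get_shape().as_list()[0]

start = time.time()
# graph construction of z only; no session is run here
z = [x[x_id] + y[y_id] for x_id in range(rows_x) for y_id in range(rows_y)]
z = tf.reshape(z, (rows_x * rows_y, dim))
print('construction of z took {:.3f} s'.format(time.time() - start))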

Miriam Farber
    Construction time is usually an invalid metric of performance, since it happens only once; no one cares about a difference of <1s start-up time. I'd be much more interested in knowing if there was any inference time performance difference. – Multihunter Sep 19 '17 at 04:13