Building on this question, I am looking to update the values of a 2-D tensor at the first position in each row where the tf.where condition is met. Here is some sample code I am using to simulate the problem:

import tensorflow as tf

tf.reset_default_graph()
graph = tf.Graph()
with graph.as_default():
    val = "hello"
    new_val = "goodbye"
    matrix = tf.constant([["word", "hello", "hello"],
                          ["word", "other", "hello"],
                          ["hello", "hello", "hello"],
                          ["word", "word", "word"]
                         ])
    # (row, col) indices of every element equal to `val`
    matching_indices = tf.where(tf.equal(matrix, val))
    # per row, the smallest matching column index, i.e. the first match
    first_matching_idx = tf.segment_min(data=matching_indices[:, 1],
                                        segment_ids=matching_indices[:, 0])

sess = tf.InteractiveSession(graph=graph)
print(sess.run(first_matching_idx))

This will output [1, 2, 0]: the 1 is the column of the first "hello" in row 0, the 2 is the column of the first "hello" in row 1, and the 0 is the column of the first "hello" in row 2 (the last row contains no "hello", so it produces no entry).

However, I can't figure out how to update the value at that first matching index: basically, I want the first "hello" in each row to be turned into "goodbye". I have tried tf.scatter_update(), but it does not seem to work on 2-D tensors. Is there any way to modify the 2-D tensor as described?

  • Your idea of using scatter_update seems promising. The documentation of scatter_update seems to suggest it works with high dimensional tensors. Maybe still worth to find out what the specific problem is. Note that the problem could be transformed to a 1D problem as well by offsetting the index by the row index * row length. – Yao Zhang Aug 15 '17 at 16:59
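
A minimal sketch of that flatten-to-1-D idea, assuming TF 1.x and the example data above (the variable names are illustrative, and it relies on every row up to the last matching one containing val, which holds for this matrix):

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    val, new_val = "hello", "goodbye"
    num_cols = 3  # row length, known statically for this example

    matrix = tf.constant([["word", "hello", "hello"],
                          ["word", "other", "hello"],
                          ["hello", "hello", "hello"],
                          ["word", "word", "word"]])
    # Hold the data in a flat 1-D variable so tf.scatter_update applies.
    flat_var = tf.Variable(tf.reshape(matrix, [-1]))

    matching_indices = tf.where(tf.equal(matrix, val))  # int64 (row, col) pairs
    first_col = tf.segment_min(data=matching_indices[:, 1],
                               segment_ids=matching_indices[:, 0])
    rows = tf.range(tf.shape(first_col, out_type=tf.int64)[0])
    # Offset the column by row * row_length to get indices into the flat view.
    flat_idx = rows * num_cols + first_col

    updates = tf.tile([new_val], [tf.size(flat_idx)])
    updated_flat = tf.scatter_update(flat_var, flat_idx, updates)
    updated = tf.reshape(updated_flat, tf.shape(matrix))

with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(updated))  # first "hello" in each matching row is now "goodbye"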

1 Answer

One easy workaround is to use tf.py_func with a NumPy array:

import numpy as np

def ch_val(array, val, new_val):
    # For each row that contains `val`, record (row, first column of `val`).
    idx = np.array([[s, list(row).index(val)]
                    for s, row in enumerate(array) if val in row])
    # Convert to an advanced-indexing tuple and overwrite those positions.
    idx = tuple((idx[:, 0], idx[:, 1]))
    array[idx] = new_val
    return array

...
matrix = tf.Variable([["word", "hello", "hello"],
                      ["word", "other", "hello"],
                      ["hello", "hello", "hello"],
                      ["word", "word", "word"]
                     ])
matrix = tf.py_func(ch_val, [matrix, 'hello', 'goodbye'], tf.string)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(matrix))
    # results: [['word' 'goodbye' 'hello']
    #           ['word' 'other' 'goodbye']
    #           ['goodbye' 'hello' 'hello']
    #           ['word' 'word' 'word']]
    Thanks! Would py_func cause any speed issues at training time? If this is being performed for every batch I am worried the python function will slow things down significantly – reese0106 Aug 15 '17 at 13:20
  • it won't slow things down significantly, but it will be slower than an optimal version without `tf.py_func` – Ishant Mrinal Aug 15 '17 at 14:07