
Suppose I have the following tensor t as the output of a softmax function:

t = tf.constant(value=[[0.2,0.8], [0.6, 0.4]])
>> [ 0.2,  0.8]
   [ 0.6,  0.4]

Now I would like to convert this matrix t into a matrix that resembles a one-hot encoded matrix:

Y.eval()
>> [   0,    1]
   [   1,    0]

I am familiar with c = tf.argmax(t, axis=1), which gives me the index per row of t that should be 1. But going from c to Y seems quite difficult.

What I have tried so far is converting t to a tf.SparseTensor using c and then using tf.sparse_tensor_to_dense() to get Y. That conversion involves quite a few steps and seems like overkill for the task - I haven't finished it completely, but I am confident it can be made to work (a rough sketch of that route is below).
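For reference, this is roughly what the sparse route would look like. It is only a sketch, written against the current tf.sparse API (tf.sparse.to_dense instead of the older tf.sparse_tensor_to_dense()) and assuming eager execution; the intermediate names are mine:

import tensorflow as tf

t = tf.constant([[0.2, 0.8], [0.6, 0.4]])

c = tf.argmax(t, axis=1)                            # column of the maximum per row
rows = tf.range(tf.shape(t, out_type=tf.int64)[0])  # row indices 0..n-1
indices = tf.stack([rows, c], axis=1)               # (row, col) positions of the ones

sparse = tf.SparseTensor(indices=indices,
                         values=tf.ones_like(c, dtype=t.dtype),
                         dense_shape=tf.shape(t, out_type=tf.int64))
Y = tf.sparse.to_dense(sparse)                      # [[0., 1.], [1., 0.]]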

Is there a more appropriate or easier way to make this conversion that I am missing?

The reason I need this is that I have a custom OneHot encoder in Python into which I can feed Y. tf.one_hot() is not flexible enough for that - it doesn't allow custom encodings.

2 Answers


Why not combine tf.argmax() with tf.one_hot()?

Y = tf.one_hot(tf.argmax(t, axis=1), depth=2)
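Applied to the t from the question (eager TF 2.x assumed), this reproduces the desired Y:

import tensorflow as tf

t = tf.constant([[0.2, 0.8], [0.6, 0.4]])
Y = tf.one_hot(tf.argmax(t, axis=1), depth=2)
print(Y.numpy())
# [[0. 1.]
#  [1. 0.]]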

  • That is so obvious! Really feel a dork for not seeing that sooner. And I think you are correct. Although `tf.one_hot()` is rather meant for categorizing than for assigning 1 to indices, it does seem to actually do the latter (from my limited testing) as well given the depth is specified. Thank you! – Davor Josipovic Jul 29 '16 at 10:34

I have compared five ways of doing the conversion, using an input of shape (20, 256, 256, 4) in TensorFlow 2.1.0. The average times per conversion below were measured on a Quadro RTX 8000.

one_hot-argmax (0.802 us):

y = tf.one_hot(tf.argmax(x, axis=3), x.shape[3])

cast-reduce_max (0.719 us):

y = tf.cast(tf.equal(x, tf.reduce_max(x, axis=3, keepdims=True)),
            tf.float32)

cast-tile-reduce_max (0.862 us):

y = tf.cast(tf.equal(x, tf.tile(tf.reduce_max(x, axis=3, keepdims=True),
                                [1, 1, 1, x.shape[3]])),
            tf.float32)

where-reduce_max (1.850 us):

y = tf.where(tf.equal(x, tf.reduce_max(x, axis=3, keepdims=True)),
             tf.constant(1., shape=x.shape),
             tf.constant(0., shape=x.shape))

where-tile-reduce_max (1.691 us):

y = tf.where(tf.equal(x, tf.tile(tf.reduce_max(x, axis=3, keepdims=True),
                                 [1, 1, 1, x.shape[3]])),
             tf.constant(1., shape=x.shape),
             tf.constant(0., shape=x.shape))

The code used to generate these results is below:

import time
import tensorflow as tf

shape = (20, 256, 256, 4)
N = 1000

def one_hot():
    for i in range(N):
        x = tf.random.normal(shape)
        x = tf.nn.softmax(tf.random.normal(shape), axis=3)
        x = tf.one_hot(tf.argmax(x, axis=3), x.shape[3])
    return None
    
def cast_reduce_max():
    for i in range(N):
        x = tf.random.normal(shape)
        x = tf.nn.softmax(tf.random.normal(shape), axis=3)
        x = tf.cast(tf.equal(x, tf.reduce_max(x, axis=3, keepdims=True)),
                    tf.float32)
    return None

def cast_tile():
    for i in range(N):
        x = tf.random.normal(shape)
        x = tf.nn.softmax(tf.random.normal(shape), axis=3)
        x = tf.cast(tf.equal(x, tf.tile(tf.reduce_max(x, axis=3, keepdims=True), [1, 1, 1, x.shape[3]])),
                    tf.float32)
    return None    
    
def where_reduce_max():
    for i in range(N):
        x = tf.random.normal(shape)
        x = tf.nn.softmax(tf.random.normal(shape), axis=3)
        x = tf.where(tf.equal(x, tf.reduce_max(x, axis=3, keepdims=True)),
                     tf.constant(1., shape=x.shape),
                     tf.constant(0., shape=x.shape))
    return None

def where_tile():
    for i in range(N):
        x = tf.random.normal(shape)
        x = tf.nn.softmax(tf.random.normal(shape), axis=3)
        x = tf.where(tf.equal(x, tf.tile(tf.reduce_max(x, axis=3, keepdims=True), [1, 1, 1, x.shape[3]])),
                     tf.constant(1., shape=x.shape),
                     tf.constant(0., shape=x.shape))
    return None

def blank():
    # Baseline: only the input generation, so its cost can be separated
    # from the conversion cost of the functions above.
    for i in range(N):
        x = tf.random.normal(shape)
        x = tf.nn.softmax(tf.random.normal(shape), axis=3)
    return None

t0 = time.time()
one_hot()
print(f"one_hot:\t{time.time()-t0}")    

t0 = time.time()
cast_reduce_max()
print(f"cast_reduce_max:\t{time.time()-t0}")

t0 = time.time()
cast_tile()
print(f"cast_tile:\t{time.time()-t0}")

t0 = time.time()
where_reduce_max()
print(f"where_reduce_max:\t{time.time()-t0}")

t0 = time.time()
where_tile()
print(f"where_tile:\t{time.time()-t0}")

t0 = time.time()
blank()
print(f"blank:\t{time.time()-t0}")
  • Excellent. The relative performance of `cast_reduce_max` is actually even much better on RTX 2080 Ti (x3.35). – user209974 Aug 21 '22 at 10:56
  • A subtle difference, though, is that `cast_reduce_max` does not guarantee that you get a single prediction. Might not happen in a lifetime, but then again. – user209974 Aug 21 '22 at 11:02
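To make that tie caveat concrete: when two entries share the row maximum, the equal/reduce_max formulation marks both positions, whereas argmax/one_hot always emits a single one. A small illustration with made-up values (eager TF 2.x assumed):

import tensorflow as tf

x = tf.constant([[0.5, 0.5]])  # tie: both entries equal the row maximum

y_cast = tf.cast(tf.equal(x, tf.reduce_max(x, axis=1, keepdims=True)), tf.float32)
y_onehot = tf.one_hot(tf.argmax(x, axis=1), x.shape[1])

print(y_cast.numpy())    # [[1. 1.]] - two "hot" positions
print(y_onehot.numpy())  # exactly one "hot" position, e.g. [[1. 0.]]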