
The image below is a sample semantic map from the Cityscapes Dataset. It's provided in the form of an RGB image where each specific colour represents a class.

In some deep learning tasks, we would like to map this into a one-hot encoding. For example, if there are 20 classes, the image would be mapped from H x W x 3 to H x W x 20.
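
To make the target shape concrete, here is a minimal NumPy sketch on a made-up 2 x 2 image with a hypothetical three-colour palette (neither comes from Cityscapes). Each pixel becomes a vector with a single 1 at the index of its colour.

```python
import numpy as np

# Hypothetical 3-class palette, for illustration only (not the Cityscapes palette).
toy_palette = np.array([[255, 0, 0],
                        [0, 255, 0],
                        [0, 0, 255]], dtype=np.uint8)

# A 2 x 2 RGB label image whose pixels are all palette colours.
toy_img = np.array([[[255, 0, 0], [0, 255, 0]],
                    [[0, 0, 255], [255, 0, 0]]], dtype=np.uint8)

# One boolean H x W mask per colour, stacked along a new last axis.
masks = [np.all(toy_img == colour, axis=-1) for colour in toy_palette]
toy_one_hot = np.stack(masks, axis=-1).astype(np.float32)

print(toy_one_hot.shape)   # (2, 2, 3)
print(toy_one_hot[0, 1])   # [0. 1. 0.]
```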

How do we do this in TensorFlow?

Image: aachen_000000_000019_gtFine_color.png

jkschin

1 Answer


My solution is below. I'm looking forward to suggestions on how to make it more efficient, or to an alternative answer that is faster.

import tensorflow as tf
import numpy as np
import scipy.misc

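# NOTE scipy.misc.imread and scipy.misc.imsave were removed in SciPy 1.2, so
# this script needs an older SciPy release.
img = scipy.misc.imread('aachen_000000_000019_gtFine_color.png', mode='RGB')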
palette = np.array(
[[128,  64, 128],
 [244,  35, 232],
 [ 70,  70,  70],
 [102, 102, 156],
 [190, 153, 153],
 [153, 153, 153],
 [250, 170,  30],
 [220, 220,   0],
 [107, 142,  35],
 [152, 251, 152],
 [ 70, 130, 180],
 [220,  20,  60],
 [255,   0,   0],
 [  0,   0, 142],
 [  0,   0,  70],
 [  0,  60, 100],
 [  0,  80, 100],
 [  0,   0, 230],
 [119,  11,  32],
 [  0,   0,   0],
 [255, 255, 255]], np.uint8)

semantic_map = []
for colour in palette:
  class_map = tf.reduce_all(tf.equal(img, colour), axis=-1)
  semantic_map.append(class_map)
semantic_map = tf.stack(semantic_map, axis=-1)
# NOTE cast to tf.float32 because most neural networks operate in float32.
semantic_map = tf.cast(semantic_map, tf.float32)
magic_number = tf.reduce_sum(semantic_map)
print(semantic_map.shape)

palette = tf.constant(palette, dtype=tf.uint8)
class_indexes = tf.argmax(semantic_map, axis=-1)
# NOTE this operation flattens class_indexes
class_indexes = tf.reshape(class_indexes, [-1])
color_image = tf.gather(palette, class_indexes)
color_image = tf.reshape(color_image, [1024, 2048, 3])

# NOTE this uses the TensorFlow 1.x graph/session API.
sess = tf.Session()
# NOTE magic_number checks that there are only 1024*2048 1s in the entire
# 1024*2048*21 tensor.
magic_number_val = sess.run(magic_number)
assert magic_number_val == 1024*2048
color_image_val = sess.run(color_image)
scipy.misc.imsave('test.png', color_image_val)
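
The round trip in the second half of the code (argmax, then a palette lookup, then a sanity check on the sum) can be tried without a session. The following is a NumPy sketch of the same idea, not the TensorFlow code itself; the helper names and the tiny three-colour palette are hypothetical. It also shows why the sum check matters: a pixel whose colour is missing from the palette encodes to an all-zero vector, and argmax silently maps it to class 0.

```python
import numpy as np

def encode(img, palette):
    """Return an H x W x C float32 one-hot map (hypothetical helper)."""
    masks = [np.all(img == colour, axis=-1) for colour in palette]
    return np.stack(masks, axis=-1).astype(np.float32)

def decode(one_hot, palette):
    """Map an H x W x C one-hot map back to an H x W x 3 colour image."""
    class_indexes = np.argmax(one_hot, axis=-1)
    # Integer-array indexing plays the role of tf.gather and keeps the H x W shape.
    return palette[class_indexes]

# Tiny illustrative palette and 2 x 2 image (assumptions, not the full Cityscapes set).
palette = np.array([[0, 0, 0], [128, 64, 128], [70, 70, 70]], dtype=np.uint8)
img = np.array([[[128, 64, 128], [70, 70, 70]],
                [[0, 0, 0], [128, 64, 128]]], dtype=np.uint8)

one_hot = encode(img, palette)
# Same check as magic_number: exactly one 1 per pixel.
assert one_hot.sum() == img.shape[0] * img.shape[1]
assert np.array_equal(decode(one_hot, palette), img)

# A colour that is not in the palette leaves an all-zero vector behind.
bad = img.copy()
bad[0, 0] = [1, 2, 3]
bad_one_hot = encode(bad, palette)
print(bad_one_hot.sum())                   # 3.0
print(decode(bad_one_hot, palette)[0, 0])  # [0 0 0]
```
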
jkschin
  • You can probably speed up your code at 'color_image' with the tf.gather function (https://www.tensorflow.org/api_docs/python/tf/gather): use tf.gather(palette, class_indexes), then reshape as you did. – Anthony D'Amato Oct 24 '17 at 07:06
  • I've made the changes and tested it. It is indeed faster. Thanks for the suggestion! – jkschin Oct 24 '17 at 07:25
  • Can you share the image you are using, so that I can test other things? :) – Anthony D'Amato Oct 24 '17 at 07:36
  • I can't share it according to the license. You have to download it from Cityscapes. – jkschin Oct 24 '17 at 08:00
  • So what you really want to do is, from this image [H, W, 3], create a one-hot matrix [H, W, 21], right? – Anthony D'Amato Oct 25 '17 at 01:12
  • If that is the case, you can try this with your dimensions: equality = tf.equal(tf.reshape(img, [H, W, 1, 3]), tf.reshape(palette, [21, 3])) followed by equality = tf.cast(tf.reduce_all(equality, axis=-1), tf.int32). The first line checks equality against each color; wherever a color gives [True, True, True], the second line puts a 1 for that color (and 0 for the others, since they are False). – Anthony D'Amato Oct 25 '17 at 01:18
  • Did you try this solution? – Anthony D'Amato Oct 30 '17 at 04:41
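
The broadcasting idea from the comments above can be sketched in NumPy, assuming the image is laid out as H x W x 3 and the palette as C x 3; the function name and the small palette are hypothetical. Adding a length-1 axis to the image lets a single comparison test every pixel against every colour, at the cost of a temporary H x W x C x 3 boolean array (roughly 130 MB for a 1024 x 2048 image and 21 colours).

```python
import numpy as np

def one_hot_broadcast(img, palette):
    """One-hot encode an H x W x 3 colour map against a C x 3 palette (hypothetical helper)."""
    # (H, W, 1, 3) == (C, 3) broadcasts to (H, W, C, 3).
    equality = img[:, :, np.newaxis, :] == palette
    # A class matches only when all three channels agree.
    return np.all(equality, axis=-1).astype(np.int32)

# Small illustrative palette and a 1 x 3 image (assumptions for the example).
palette = np.array([[128, 64, 128], [244, 35, 232], [70, 70, 70]], dtype=np.uint8)
img = np.array([[[70, 70, 70], [128, 64, 128], [244, 35, 232]]], dtype=np.uint8)

one_hot = one_hot_broadcast(img, palette)
print(one_hot.shape)        # (1, 3, 3)
print(one_hot[0].tolist())  # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

Compared with looping over the palette, this trades memory for a single vectorised comparison, so which one is faster may depend on the image size and the number of classes.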