
Let's say one has 600 annotated semantic segmentation mask images containing 10 different colors, each representing one entity. These images are in a numpy array of shape (600, 3, 72, 96), where n = 600, 3 = RGB channels, 72 = height, 96 = width.

How can I map each RGB pixel in the numpy array to a color-index value? For example, given a color list [(128, 128, 0), (240, 128, 0), ...n], every (240, 128, 0) pixel in the numpy array would be converted to its index in that unique mapping (= 1).
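
For a concrete miniature of the desired result (values made up), assuming a batch of one image with two pixels:

import numpy as np

# one image in channel-first layout, shape (1, 3, 1, 2):
# pixel (0, 0) is (128, 128, 0), pixel (0, 1) is (240, 128, 0)
imgs = np.array([[[[128, 240]],
                  [[128, 128]],
                  [[  0,   0]]]], dtype=np.uint8)

# desired output:
#   color list  -> [(128, 128, 0), (240, 128, 0)]
#   index image -> shape (1, 1, 2), values [[[0, 1]]]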

How can I do this efficiently and with less code? Here's one solution I came up with, but it's quite slow.

# Input imgs.shape = (N, 3, H, W), where (N = count, W = width, H = height)
def unique_map_pixels(imgs):
  original_shape = imgs.shape

  # imgs.shape = (N, H, W, 3)
  imgs = imgs.transpose(0, 2, 3, 1)

  # tupleview.shape = (N*H*W, 1); each element is one (R, G, B) record
  tupleview = imgs.reshape(-1, 3).view(imgs.dtype.descr * imgs.shape[3])

  # get unique pixel values in images, [(R, G, B), ...]
  uniques = list(np.unique(tupleview))

  # map uniques into a dict ({"RXGXB": 0, "RXGXB": 1, ...})
  uniqmap = {}
  idx = 0
  for x in uniques:
    uniqmap["%sX%sX%s" % (x[0], x[1], x[2])] = idx
    idx = idx + 1
    if idx >= np.iinfo(np.uint16).max:
      raise Exception("Can handle only %s distinct colors" % np.iinfo(np.uint16).max)

  # imgs1d.shape = (N*H*W,); contains RGB records
  imgs1d = tupleview.reshape(np.prod(tupleview.shape))

  # imgsmapped.shape = (N*H*W,); contains uniques-index values
  imgsmapped = np.empty((len(imgs1d))).astype(np.uint16)

  # map each pixel into unique-pixel-ID
  idx = 0
  for x in imgs1d:
    key = "%sX%sX%s" % (x[0], x[1], x[2])
    imgsmapped[idx] = uniqmap[key]
    idx = idx + 1

  imgsmapped.shape = (original_shape[0], original_shape[2], original_shape[3]) # (N, H, W)
  return (imgsmapped, uniques)

Testing it:

import numpy as np
n = 30
pixelvalues = (np.random.rand(10)*255).astype(np.uint8)
images = np.random.choice(pixelvalues, (n, 3, 72, 96))

(mapped, pixelmap) = unique_map_pixels(images)
assert len(pixelmap) == mapped.max()+1
assert mapped.shape == (len(images), images.shape[2], images.shape[3])
assert pixelmap[mapped[int(n*0.5)][60][81]][0] == images[int(n*0.5)][0][60][81]
print("Done: %s" % list(mapped.shape))
Mika Vatanen
  • hmm. Why do you want to do this? It seems like it's just adding a step for no reason. If you ever want to do anything with those color indices, you're going to have to search the dict and turn them back into RGB tuples, no? Edit: Nevermind, I see. If you're storing a bunch of images, it's more efficient to store ints instead of a bunch of tuples, since you anticipate a certain number of colors anyway (10), correct? – pretzlstyle Aug 16 '16 at 18:28
  • Yes, the number of colors is limited. I need the unique indices because I'm feeding the pixels to an algorithm that predicts pixel categories, not pixel colors. Greyscale images (with intensity e.g. 0-10) would also do, but then the images wouldn't be easily visualizable with standard tools (image viewers, editors, etc.). In the end, after prediction, the indices need to be mapped back to RGB values, yes. – Mika Vatanen Aug 16 '16 at 19:02
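
For reference, the reverse step mentioned in the comment above (index image back to RGB) is a single fancy-indexing operation. A minimal sketch, assuming mapped and uniques are the two values returned by unique_map_pixels (the map_back_to_rgb name is just for illustration):

import numpy as np

def map_back_to_rgb(mapped, uniques):
  # palette.shape = (num_colors, 3)
  palette = np.array([tuple(u) for u in uniques], dtype=np.uint8)
  # each index becomes its RGB triple: (N, H, W) -> (N, H, W, 3)
  rgb = palette[mapped]
  # back to the channel-first layout (N, 3, H, W)
  return rgb.transpose(0, 3, 1, 2)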

3 Answers


Here's a compact vectorized approach without those error checks -

def unique_map_pixels_vectorized(imgs):
    N,H,W = len(imgs), imgs.shape[2], imgs.shape[3]
    # one RGB row per pixel: (N*H*W, 3)
    img2D = imgs.transpose(0, 2, 3, 1).reshape(-1,3)
    # collapse each (R, G, B) triple into a single scalar ID
    ID = np.ravel_multi_index(img2D.T,img2D.max(0)+1)
    # tags[i] = position of pixel i's color among the sorted unique IDs
    _, firstidx, tags = np.unique(ID,return_index=True,return_inverse=True)
    return tags.reshape(N,H,W), img2D[firstidx]

Runtime test and verification -

In [24]: # Setup inputs (3x smaller than original ones)
    ...: N,H,W = 200,24,32
    ...: imgs = np.random.randint(0,10,(N,3,H,W))
    ...: 

In [25]: %timeit unique_map_pixels(imgs)
1 loop, best of 3: 2.21 s per loop

In [26]: %timeit unique_map_pixels_vectorized(imgs)
10 loops, best of 3: 37 ms per loop ## 60x speedup!

In [27]: map1,unq1 = unique_map_pixels(imgs)
    ...: map2,unq2 = unique_map_pixels_vectorized(imgs)
    ...: 

In [28]: np.allclose(map1,map2)
Out[28]: True

In [29]: np.allclose(np.array(map(list,unq1)),unq2)
Out[29]: True
Divakar
  • 60x indeed, preprocessing time for one dataset reduced from 4 hours to 4 minutes :) Thank you! Documentation for ravel_multi_index is very sparse, I can't really understand what it does. What does using the max pixel value as a starting point mean? As I understand it, it somehow compresses these 3-element arrays into unique int representations, but how, and what do those (large) integers represent? – Mika Vatanen Aug 17 '16 at 07:12
  • @Mika Ah you are in total luck! Check out this post: http://stackoverflow.com/a/38674038/3293881 – Divakar Aug 17 '16 at 08:42
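
For anyone else wondering about ravel_multi_index here: as I understand it, each (R, G, B) column is treated as coordinates into a virtual grid of shape img2D.max(0)+1, and the function returns the flat (C-order) index of that cell, so every distinct color collapses to a distinct integer. A tiny illustrative sketch (values made up):

import numpy as np

colors = np.array([[128, 128,   0],
                   [240, 128,   0],
                   [128, 128,   0]])        # three pixels, two distinct colors

dims = colors.max(0) + 1                    # virtual grid shape, here (241, 129, 1)
ids = np.ravel_multi_index(colors.T, dims)  # flat index = R*129*1 + G*1 + B*1
print(ids)                                  # [16640 31088 16640] -- equal colors share an ID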

I have a 3-channel image, and for each class I know the pixel values of its 3 channels: if a pixel has those 3 values in its 3 channels, it belongs to that class (e.g. class 'A'). Basically, I want to generate an array with as many channels as there are classes, with each class separated into its own channel. This can be done:

seg_channel = np.zeros((image.shape[0], image.shape[1], num_classes))
pixel_class_dict = {'0': [128, 64, 128], '1': [230, 50, 140]}  # num_classes=2
for channel in range(num_classes):
    pixel_value = pixel_class_dict[str(channel)]
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            if list(image[i][j]) == pixel_value:
                seg_channel[i, j, channel] = 1

There is also another way to do this efficiently:

import numpy as np
import cv2

for class_id in pixel_class_dict:
    class_color = np.array(pixel_class_dict[class_id])
    # inRange is 255 where the pixel matches class_color exactly, 0 elsewhere
    seg_channel[:, :, int(class_id)] = cv2.inRange(image, class_color, class_color).astype('bool').astype('float32')

This is what I do:

import numpy as np

def rgb2mask(img):
    # accept channel-first input: (3, H, W) -> (H, W, 3)
    if img.shape[0] == 3:
        img = np.rollaxis(img, 0, 3)

    # encode each RGB triple as a single integer: R + G*256 + B*256**2
    W = np.power(256, [[0], [1], [2]])

    img_id = img.dot(W).squeeze(-1)
    values = np.unique(img_id)

    mask = np.zeros(img_id.shape)
    cmap = {}

    for i, c in enumerate(values):
        idx = img_id == c
        mask[idx] = i
        cmap[tuple(img[idx][0])] = i
    return mask, cmap

If you want to map values according to an already existing dictionary, check out my answer on this thread: Convert RGB image to index image
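
A minimal sketch of that variant (not necessarily the linked answer's exact code), assuming an already-existing color2index dictionary (palette values made up here) and an img of shape (H, W, 3):

import numpy as np

color2index = {(128, 64, 128): 0, (230, 50, 140): 1}  # hypothetical palette

def rgb2mask_fixed(img, color2index):
    # encode each RGB triple as one integer: R*256^2 + G*256 + B
    w = np.power(256, [2, 1, 0])
    img_id = img.astype(np.int64) @ w    # shape (H, W)

    mask = np.full(img_id.shape, -1)     # -1 marks colors missing from the dictionary
    for color, idx in color2index.items():
        mask[img_id == np.dot(color, w)] = idx
    return mask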

Mendrika