Simply put, what I'm trying to do is similar to this question: Convert RGB image to index image. However, instead of a 1-channel index image, I want an n-channel image where img[h, w] is a one-hot encoded vector. For example, if the input image is [[[0, 0, 0], [255, 255, 255]]], and index 0 is assigned to black and 1 to white, then the desired output is [[[1, 0], [0, 1]]].
Like the person who asked the previous question, I have implemented this naively, but the code runs quite slowly, and I believe a proper solution using numpy would be significantly faster.
Also, as suggested in the previous post, I could convert each image to grayscale first and then one-hot encode it, but I want a more general solution.
Example
Say I want to assign white to 0, red to 1, blue to 2, and yellow to 3:
(255, 255, 255): 0
(255, 0, 0): 1
(0, 0, 255): 2
(255, 255, 0): 3
Suppose I have an image consisting of those four colors, where image is a 3D array containing the R, G, B values for each pixel:
[
[[255, 255, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
[[ 0, 0, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
[[ 0, 0, 255], [ 0, 0, 255], [255, 255, 255], [255, 255, 255]],
[[255, 255, 255], [255, 255, 255], [255, 255, 0], [255, 255, 0]]
]
This is what I want to get, where each pixel is replaced by the one-hot encoding of its index. (Since converting a 2D array of index values to a 3D array of one-hot encoded values is easy, getting a 2D array of index values is fine too.)
[
[[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
[[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
[[0, 0, 1, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0]],
[[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
]
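For reference, the "index values to one-hot" step mentioned in the parenthetical can be sketched as below. This is only an illustration: the names index_image and num_colors are made up here, and the index array is written out by hand from the color-to-index mapping above.

```python
import numpy as np

# Hand-written 2D index image for the 4x4 example
# (0 = white, 1 = red, 2 = blue, 3 = yellow).
index_image = np.array([
    [0, 0, 1, 1],
    [2, 0, 1, 1],
    [2, 2, 0, 0],
    [0, 0, 3, 3],
])

num_colors = 4  # assumed number of palette entries

# Row i of the identity matrix is the one-hot vector for index i, so
# fancy-indexing it with an (H, W) array yields an (H, W, num_colors) array.
one_hot = np.eye(num_colors, dtype=np.uint8)[index_image]
```

The result has shape (4, 4, 4), one channel per color.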
In this example I used colors whose RGB components are either 255 or 0, but I don't want solutions to rely on that fact.
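To make the requirement concrete, here is a rough sketch of the kind of vectorized solution I have in mind, comparing every pixel against every palette color via broadcasting. The names palette, rgb_to_one_hot and rgb_to_index are hypothetical, and the sketch assumes every pixel exactly matches one palette entry; it may well not be the fastest or most memory-friendly way to do this.

```python
import numpy as np

# Assumed palette: row i holds the RGB color assigned to index i.
palette = np.array([
    [255, 255, 255],  # 0: white
    [255, 0, 0],      # 1: red
    [0, 0, 255],      # 2: blue
    [255, 255, 0],    # 3: yellow
], dtype=np.uint8)

image = np.array([
    [[255, 255, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
    [[0, 0, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
    [[0, 0, 255], [0, 0, 255], [255, 255, 255], [255, 255, 255]],
    [[255, 255, 255], [255, 255, 255], [255, 255, 0], [255, 255, 0]],
], dtype=np.uint8)


def rgb_to_one_hot(image, palette):
    """Return an (H, W, N) uint8 array, one channel per palette color."""
    image = np.asarray(image)
    palette = np.asarray(palette)
    # (H, W, 1, 3) compared with (N, 3) broadcasts to (H, W, N, 3);
    # a pixel matches a color only if all three components are equal.
    matches = (image[:, :, None, :] == palette).all(axis=-1)
    return matches.astype(np.uint8)


def rgb_to_index(image, palette):
    """Return an (H, W) array of palette indices.

    Caveat: a pixel that matches no palette color gets an all-zero
    one-hot vector, for which argmax returns 0.
    """
    return rgb_to_one_hot(image, palette).argmax(axis=-1)


one_hot = rgb_to_one_hot(image, palette)
indices = rgb_to_index(image, palette)
```

Because the comparison is exact equality on all three components, it does not depend on the components being 0 or 255. The intermediate boolean array has H * W * N * 3 elements, so this trades memory for simplicity when the palette is large.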