I have a NumPy image in RGB bytes, let's say it's this 2x3 image:
import numpy as np

img = np.array([[[  0, 255,   0], [255, 255, 255]],
                [[255,   0, 255], [  0, 255, 255]],
                [[255,   0, 255], [  0,   0,   0]]])
I also have a palette that covers every color used in the image. Let's say it's this palette:
palette = np.array([[255,   0, 255],
                    [  0, 255,   0],
                    [  0, 255, 255],
                    [  0,   0,   0],
                    [255, 255, 255]])
Is there some combination of indexing the image against the palette (or vice versa) that will give me a paletted image equivalent to this?
img_p = np.array([[1, 4],
                  [0, 2],
                  [0, 3]])
For comparison, I know the reverse direction is simple: `palette[img_p]` gives a result equivalent to `img`. I'm trying to figure out if there's a similar indexing trick in the opposite direction that will let NumPy do all the heavy lifting.
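To make the reverse direction concrete, here is that one-liner applied to the arrays above: fancy indexing replaces each palette index with its RGB triple.

```python
import numpy as np

palette = np.array([[255,   0, 255],
                    [  0, 255,   0],
                    [  0, 255, 255],
                    [  0,   0,   0],
                    [255, 255, 255]])

img_p = np.array([[1, 4],
                  [0, 2],
                  [0, 3]])

# Each index in the (3, 2) array is replaced by its RGB triple,
# producing a (3, 2, 3) RGB image.
img = palette[img_p]
```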
I know I can just iterate over all the image pixels individually and build my own paletted image. I'm hoping there's a more elegant option.
Okay, so I implemented the various solutions below and ran them over a moderate test set: 20 images, each one 2000x2000 pixels, with a 32-element palette of three-byte colors. Pixels were given random palette indexes. All algorithms were run over the same images.
Timing results:
- mostly empty lookup array: 0.89 seconds
- `np.searchsorted` approach: 3.20 seconds
- Pandas lookup, single integer: 38.7 seconds
- `==` comparison, then aggregating the boolean results: 66.4 seconds
- inverting the palette into a dict and using `np.apply_along_axis()`: probably ~500 seconds, extrapolated from a smaller test set
- Pandas lookup with a MultiIndex: probably ~3000 seconds, extrapolated from a smaller test set
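For reference, the mostly empty lookup array amounts to something like this sketch (assuming 8-bit RGB with no alpha channel, so the table needs 2^24 entries, roughly 16 MB of mostly zeros; `pack` is a helper name I'm using here, not anything from NumPy):

```python
import numpy as np

img = np.array([[[  0, 255,   0], [255, 255, 255]],
                [[255,   0, 255], [  0, 255, 255]],
                [[255,   0, 255], [  0,   0,   0]]])

palette = np.array([[255,   0, 255],
                    [  0, 255,   0],
                    [  0, 255, 255],
                    [  0,   0,   0],
                    [255, 255, 255]])

def pack(rgb):
    # Collapse each RGB triple into a single 24-bit integer key.
    rgb = rgb.astype(np.uint32)
    return (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]

# One uint8 slot per possible 24-bit color; only palette colors are filled in.
lut = np.zeros(1 << 24, dtype=np.uint8)
lut[pack(palette)] = np.arange(len(palette))

img_p = lut[pack(img)]
```

Adding an alpha channel would push the table to 2^32 entries, which is why the memory cost becomes prohibitive.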
Given that the lookup array has a significant memory penalty (and a prohibitive one if there's an alpha channel), I'm going to go with the `np.searchsorted` approach. The lookup array is significantly faster if you want to spend the RAM on it.
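A searchsorted solution can look roughly like this sketch (not necessarily the exact implementation benchmarked above): pack each pixel into a single integer key, binary-search the sorted palette keys, then map the sorted positions back to the original palette indices.

```python
import numpy as np

img = np.array([[[  0, 255,   0], [255, 255, 255]],
                [[255,   0, 255], [  0, 255, 255]],
                [[255,   0, 255], [  0,   0,   0]]])

palette = np.array([[255,   0, 255],
                    [  0, 255,   0],
                    [  0, 255, 255],
                    [  0,   0,   0],
                    [255, 255, 255]])

def pack(rgb):
    # Collapse each RGB triple into a single 24-bit integer key.
    rgb = rgb.astype(np.uint32)
    return (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]

pal_keys = pack(palette)
order = np.argsort(pal_keys)                      # palette indices in key-sorted order
pos = np.searchsorted(pal_keys[order], pack(img)) # position of each pixel's key
img_p = order[pos]                                # back to original palette indices
```

This assumes every image color actually appears in the palette, as stated above; a color missing from the palette would silently map to the nearest position rather than raising an error.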