
My goal is to convert a list of pixels from RGB to hex as quickly as possible. The input is a 3-dimensional NumPy array (RGB color space), and ideally the result would keep its shape.

My attempt uses a list comprehension and, performance aside, it works. Adding the ravel and the list comprehension really slows it down, though. Unfortunately, I just don't know enough math to see how to speed this up:

Edited: Updated code to reflect the most recent changes. Currently running at around 24 ms on a 35,000-pixel image.

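import numpy as np

def np_array_to_hex(array):
    array = np.asarray(array, dtype='uint32')
    # Pack to 0xRRGGBB; the extra (1 << 24) bit keeps the hex() output at a fixed width
    array = (1 << 24) + ((array[:, :, 0]<<16) + (array[:, :, 1]<<8) + array[:, :, 2])
    # hex() returns '0x1rrggbb'; the last six characters are the color
    return [hex(x)[-6:] for x in array.ravel()]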

>>> np_array_to_hex(img)
['afb3bc', 'abaeb5', 'b3b4b9', ..., '8b9dab', '92a4b2', '9caebc']
stwhite
  • https://docs.python.org/3/library/functions.html#hex is the Python `hex` function. It returns a string that starts with `0x`. `int(..., 16)` converts it back to integer. There isn't a corresponding `numpy` functionality (that I know of). You could apply this function to each element of your array. – hpaulj Aug 15 '19 at 04:25
  • @hpaulj yep, as you can see in the question I am already using that. My question is more so about applying that function across the numpy array operations. – stwhite Aug 15 '19 at 04:30
  • `np.frompyfunc(hex,1,1)(arr) `. Another is `np.frompyfunc('0x{:07X}'.format,1,1)(arr) ` – hpaulj Aug 15 '19 at 04:33
  • @stwhite you could explore the answers here to see if any of them will get you the rest of the way to a solution: https://stackoverflow.com/questions/3380726/converting-a-rgb-color-tuple-to-a-six-digit-code-in-python/43572620#43572620 –  Aug 15 '19 at 04:48
  • Can I ask why you want to do this please? It seems likely that you want it in hex for human consumption/analysis, so it seems unlikely a human would notice whether something is ready in 23ms or 48ms... how big is your array by the way and how long does your fastest method take? – Mark Setchell Aug 15 '19 at 09:23
  • @MarkSetchell this is used to extract colors from around 1 million images in an offline process. So while it's offline, timing is still important. The most recent code runs at 24ms (I've updated the question with the most recent working code). The images are downsized to around 200px, which leaves around 40k pixels (this varies with image size). You do have a good point in that I am converting to a human-readable 6-digit hex format, but I may not need to do that to the full 35k list because it gets de-duplicated right after this. – stwhite Aug 15 '19 at 16:47

1 Answer


I tried it with a LUT ("Look Up Table") - it takes a few seconds to initialise and it uses 100MB (0.1GB) of RAM, but that's a small price to pay amortised over a million images:

#!/usr/bin/env python3

import numpy as np

def np_array_to_hex1(array):
    array = np.asarray(array, dtype='uint32')
    array = ((array[:, :, 0]<<16) + (array[:, :, 1]<<8) + array[:, :, 2])
    return array

def np_array_to_hex2(array):
    array = np.asarray(array, dtype='uint32')
    array = (1 << 24) + ((array[:, :, 0]<<16) + (array[:, :, 1]<<8) + array[:, :, 2])
    return [hex(x)[-6:] for x in array.ravel()]

def me(array, LUT):
    # Flatten to a vector of pixels, widened so the packing below cannot overflow uint8
    z = np.reshape(array, (-1, 3)).astype(np.uint32)
    # Pack each pixel into a single 24-bit colour number, 0xRRGGBB
    y = z[:,0]*65536 + z[:,1]*256 + z[:,2]
    # One fancy-indexing lookup converts every pixel to its hex string
    return LUT[y]

# Define dummy image of 35,000 RGB pixels
w,h = 175, 200
im = np.random.randint(0,256,(h,w,3),dtype=np.uint8)

# %timeit np_array_to_hex1(im)
# 112 µs ± 1.1 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

# %timeit np_array_to_hex2(im)
# 8.42 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# This may take time to set up, but amortize that over a million images...
# 'S6' means 6-byte strings (the older 'a6' alias was removed in NumPy 2.0)
LUT = np.zeros((256*256*256),dtype='S6')
for i in range(256*256*256):
    LUT[i] = f'{i:06x}'

# %timeit me(im,LUT)
# 499 µs ± 8.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So that appears to be about 4x slower than your fastest method, which doesn't work, and 17x faster than your slowest, which does work.
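Since the comments on the question mention that the colours get de-duplicated right after conversion, another option is to de-duplicate while the pixels are still integers and only format the distinct values. This is a minimal sketch under that assumption, not something I have timed; the function name `unique_hex_colors` is made up for illustration, and note that it returns a sorted list of distinct colours rather than one string per pixel:

```python
import numpy as np

def unique_hex_colors(array):
    """Return the distinct colours of an (h, w, 3) RGB image as 6-digit hex strings."""
    # Widen first so the shifts below cannot overflow uint8
    rgb = np.asarray(array, dtype=np.uint32).reshape(-1, 3)
    # Pack each pixel into one 24-bit integer, 0xRRGGBB
    packed = (rgb[:, 0] << 16) | (rgb[:, 1] << 8) | rgb[:, 2]
    # De-duplicate while the data is still numeric (np.unique returns sorted values)
    unique = np.unique(packed)
    # Only the distinct colours go through the slow per-element formatting
    return [f"{v:06x}" for v in unique.tolist()]
```

How much this saves depends on how many distinct colours an image has; a photo with nearly as many colours as pixels will gain little.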

My next suggestion is to use multi-threading or multi-processing so that all your CPU cores are busy in parallel, which should reduce your overall time by a factor of 4 or more, assuming you have a reasonably modern CPU with 4 or more cores.
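As a rough sketch of the parallel idea, here is one way to spread the per-image work over a pool of worker threads using the standard library. The names `pack_rgb` and `convert_many` are hypothetical, and the per-image step shown is only the integer packing, not the full hex conversion. Threads only help to the extent that the time is spent inside NumPy calls that release the GIL; if most of the time goes into pure-Python code, `ProcessPoolExecutor` from the same module offers the same `map` interface with separate processes.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def pack_rgb(image):
    """Pack an (h, w, 3) uint8 image into an (h, w) array of 0xRRGGBB integers."""
    # Widen first so the shifts below cannot overflow uint8
    rgb = np.asarray(image, dtype=np.uint32)
    return (rgb[:, :, 0] << 16) | (rgb[:, :, 1] << 8) | rgb[:, :, 2]

def convert_many(images, workers=4):
    """Apply pack_rgb to every image using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input images
        return list(pool.map(pack_rgb, images))
```

The actual speed-up is something you would need to measure on your own images and machine.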

Mark Setchell
  • Thanks for your answer Mark! Question though... what doesn't work about my last example in the question? The first example (`np_array_to_hex1`) is missing `(1 << 24)` so it remains an int instead of hex, though the second example fixes that. – stwhite Aug 16 '19 at 20:26
  • I don't understand what you mean. Your first method takes 0.12 milliseconds on my machine but doesn't work. Your second method does work but takes 8 milliseconds which you say is too slow, so I proposed a method that takes 0.5 milliseconds. I thought your question was about improving the 8 milliseconds, not debugging it... – Mark Setchell Aug 16 '19 at 22:13
  • Disregard method 1. There should have been only one method—I've removed the confusion from the question. I'll have to test your code in my setup because it's not just one giant loop over the images. Thanks for your answer! – stwhite Aug 16 '19 at 23:15