I have built a pixel classifier for images: for each pixel in the image, I want to determine which pre-defined color cluster it belongs to. It works, but at roughly 5 minutes per image I suspect I am doing something unpythonic that can surely be optimized.
How can I map the function directly over the list of lists?
# First I convert my image to a list
# The list below represents a true image size
list1 = [[255, 114, 70],
         [120, 89, 15],
         [247, 190, 6],
         [41, 38, 37],
         [102, 102, 10],
         [255, 255, 255]] * 3583180
Then I define the clusters to map the colors to, and the function to do so (which is taken from the PIL library):
# Colors of interest
RED = [255, 114, 70]
DARK_YELLOW = [120, 89, 15]
LIGHT_YELLOW = [247, 190, 6]
BLACK = [41, 38, 37]
GREY = [102, 102, 10]
WHITE = [255, 255, 255]
Colors = [RED, DARK_YELLOW, LIGHT_YELLOW, GREY, BLACK, WHITE]
import math

# Function to find the closest cluster by Euclidean distance in RGB space
def distance(c1, c2):
    (r1, g1, b1) = c1
    (r2, g2, b2) = c2
    return math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2)
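As a quick sanity check (my own toy example, not part of the pipeline), identical colors should be at distance 0 and dissimilar colors far apart:

```python
import math

def distance(c1, c2):
    (r1, g1, b1) = c1
    (r2, g2, b2) = c2
    return math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2)

# A pixel is at distance 0 from its own cluster...
print(distance([255, 114, 70], [255, 114, 70]))   # 0.0
# ...while pure white is far from the RED cluster.
print(distance([255, 255, 255], [255, 114, 70]))  # ≈ 232.6
```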
What remains is to match every pixel to its closest color and build a new list of the matched indexes into the original Colors:
from tqdm import tqdm

Filt_lab = []
# Match colors and build a new list with the cluster index of each pixel
for pixel in tqdm(list1):
    closest_colors = sorted(Colors, key=lambda color: distance(color, pixel))
    closest_color = closest_colors[0]
    for num, clust in enumerate(Colors):
        if list(clust) == list(closest_color):
            Filt_lab.append(num)
Running a single image takes approximately 5 minutes, which is workable, but surely there is a method by which this time can be greatly reduced?
36%|███▌ | 7691707/21499080 [01:50<03:18, 69721.86it/s]
Expected outcome of Filt_lab:
[0, 1, 2, 4, 3, 5]*3583180
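I suspect a vectorized approach along these lines could work, but I'm not sure it's the right way. This is my own NumPy sketch (not benchmarked on the full-size image): it broadcasts all pixel-to-cluster differences at once and takes the argmin, skipping the sqrt since it doesn't change the ordering.

```python
import numpy as np

# One pixel of each cluster color, in the same order as my test image pattern
pixels = np.array([[255, 114, 70],
                   [120, 89, 15],
                   [247, 190, 6],
                   [41, 38, 37],
                   [102, 102, 10],
                   [255, 255, 255]])

clusters = np.array([[255, 114, 70],    # RED
                     [120, 89, 15],     # DARK_YELLOW
                     [247, 190, 6],     # LIGHT_YELLOW
                     [102, 102, 10],    # GREY
                     [41, 38, 37],      # BLACK
                     [255, 255, 255]])  # WHITE

# Squared distances of shape (n_pixels, n_clusters); int64 avoids uint8 overflow.
diff = pixels[:, None, :].astype(np.int64) - clusters[None, :, :]
labels = (diff ** 2).sum(axis=2).argmin(axis=1)
print(labels)  # [0 1 2 4 3 5]
```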