
I want to recolor a PNG image (handwritten text drawn on a tablet) according to a dictionary (see `color_change_dict` in the code below) that specifies which color should be replaced by which.

It was easy to write code that reads the color of each pixel and recolors it according to the dictionary.

But the output image ended up pixelated (that is, the text has jagged boundaries).

After some googling I found that if the colors are changed using a linear function of the RGB values, the jaggedness can be avoided.
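For illustration, a minimal sketch of what such a linear recoloring might look like with NumPy, assuming the image is an (H, W, 3) `uint8` array. The helper name `linear_recolor` and the values of `M` and `t` are made up for this example (a plain color inversion), not fitted from my data. Because the map is continuous, edge pixels with in-between colors get in-between output colors instead of snapping to a single dictionary entry.

```python
import numpy as np


def linear_recolor(pixels, M, t):
    """Apply out = M @ rgb + t to every pixel of an (H, W, 3) uint8 array."""
    flat = pixels.reshape(-1, 3).astype(np.float64)
    out = flat @ M.T + t
    # Round, clip to the valid range, and restore the original shape
    return np.clip(np.rint(out), 0, 255).astype(np.uint8).reshape(pixels.shape)


# Hypothetical example map: invert colors, i.e. out = 255 - rgb
M = -np.eye(3)
t = np.array([255.0, 255.0, 255.0])

# A tiny 1x3 "image": black, mid-gray (like an anti-aliased edge pixel), white
img = np.array([[[0, 0, 0], [128, 128, 128], [255, 255, 255]]], dtype=np.uint8)
result = linear_recolor(img, M, t)
# black -> white, mid-gray 128 -> 127, white -> black
```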

A linear function ended up being inadequate for my purpose.

So I resorted to a quadratic function of the RGB values (one degree-2 polynomial in three variables for each color channel), as given in the code below.
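Written out, each output channel has the form below, where R, G, B are the input pixel values and a, ..., i are nine fitted coefficients (a separate set for each output channel). This only restates the expression evaluated by `rgb_out` in the code below.

```latex
\[
\mathrm{out}(R, G, B) = aR^2 + bG^2 + cB^2 + dRG + eGB + fBR + gR + hG + iB
\]
```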

The problem is that it takes about 7 seconds to process one image, which is too slow for my purpose.

Even with multiprocessing the run time is too high for my needs, which is to recolor a video. (I tried the geq filter in ffmpeg, but that also led to jaggedness in the output, even when just inverting colors.)

Is there some other way to recolor the image (keeping the same recoloring recipe)?

Is there an advantage in using non-command line tools for this purpose?

from PIL import Image
import numpy as np


def return_row(r, g, b):
    r_inv = 255 - r
    g_inv = 255 - g
    b_inv = 255 - b
    return [r_inv**2, g_inv**2, b_inv**2, r_inv * g_inv, g_inv * b_inv, b_inv * r_inv, r_inv, g_inv, b_inv]


def solve_mat(dictionary):
    A = []
    B = []
    for key, color_code in dictionary.items():
        r = color_code[0]
        g = color_code[1]
        b = color_code[2]
        value = color_code[3]
        row = return_row(r, g, b)
        A.append(row)
        B.append(value)

    X = np.linalg.lstsq(A, B, rcond=None)[0]
    return X


def get_individual_channgel_dict(color_change_dict):
    r_dict = {}
    g_dict = {}
    b_dict = {}
    for color, array in color_change_dict.items():
        r_dict[color] = array[:3]
        r_dict[color].append(array[3])

        g_dict[color] = array[:3]
        g_dict[color].append(array[4])

        b_dict[color] = array[:3]
        b_dict[color].append(array[5])

    return r_dict, g_dict, b_dict


def get_coeff_mat(r_dict, g_dict, b_dict):
    r_mat = solve_mat(r_dict)
    g_mat = solve_mat(g_dict)
    b_mat = solve_mat(b_dict)

    return r_mat, g_mat, b_mat


def rgb_out(R, G, B, param_mat):
    a = param_mat[0]
    b = param_mat[1]
    c = param_mat[2]
    d = param_mat[3]
    e = param_mat[4]
    f = param_mat[5]
    g = param_mat[6]
    h = param_mat[7]
    i = param_mat[8]

    return int((a * (R**2)) + (b * (G**2)) + (c * (B**2)) + (d * R * G) + (e * G * B) + (f * B * R) + (g * R) + (h * G) + (i * B))


def change_colors_in_one_image(image_path):
    with Image.open(image_path) as img:
        pixels = img.load()

        width, height = img.size

        # Iterate over each pixel
        for x in range(width):
            for y in range(height):
                r, g, b = pixels[x, y]
                new_r = rgb_out(r, g, b, r_mat)
                new_g = rgb_out(r, g, b, g_mat)
                new_b = rgb_out(r, g, b, b_mat)
                pixels[x, y] = (new_r, new_g, new_b)

        # Save the modified image
        img.save(image_path)

color_change_dict = {
    "white": [5, 98, 255, 240, 240, 240],
    "black": [255, 255, 255, 81, 92, 93],
    "red": [207, 54, 108, 52, 152, 219],
    "mud": [203, 103, 14, 203, 103, 14],
}

r_dict, g_dict, b_dict = get_individual_channgel_dict(color_change_dict)
r_mat, g_mat, b_mat = get_coeff_mat(r_dict, g_dict, b_dict)
change_colors_in_one_image(image_path)  # image_path: path of the PNG to recolor (overwritten in place)

As an example, the following is an input image.

[input image]

The following is the output after recoloring.

[output image]

  • Maybe check out the code in this answer https://stackoverflow.com/questions/57177649/how-can-i-quickly-change-pixels-in-a-image-from-a-color-dictionary – Yisroel Tech Aug 04 '23 at 13:08
  • Please click [edit] and add representative input and output images - both current and expected - so we know what you are referring to. Thank you. – Mark Setchell Aug 04 '23 at 14:08
  • @MarkSetchell I am getting satisfactory output. It's the run time that is bothering. I will add one sample any way. – caffeinemachine Aug 04 '23 at 14:32
  • I would be inclined to convert your image to HSV colourspace, then use `cv2.inRange()` to find a mask of pixels matching each of your input colours, then do a *"hue rotation"* to the desired output colour. – Mark Setchell Aug 04 '23 at 15:42
  • @MarkSetchell I am very new to image manipulation. Would be greatly helpful if you could give some details in an answer. Thank you. – caffeinemachine Aug 04 '23 at 17:58
  • I'm short on time today, but you make a mask in HSV space like this https://stackoverflow.com/a/50215020/2836621 Then you rotate hues like this https://stackoverflow.com/a/73956747/2836621 although that uses PIL but you just need to calculate the hue angle of your input colour and the hue angle of the output colour then add the difference to all your pixels addressed by the mask. I'll try and make time over the weekend... – Mark Setchell Aug 04 '23 at 18:49

1 Answer


I'd suggest using vectorization to do this image transformation, rather than doing it pixel-by-pixel.

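import numpy as np
from PIL import Image


# Drop-in replacements for the two functions in the question;
# r_mat, g_mat and b_mat are the coefficient arrays computed there.
def rgb_out(R, G, B, param_mat):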
    a = param_mat[0]
    b = param_mat[1]
    c = param_mat[2]
    d = param_mat[3]
    e = param_mat[4]
    f = param_mat[5]
    g = param_mat[6]
    h = param_mat[7]
    i = param_mat[8]
    val = ((a * (R**2)) + (b * (G**2)) + (c * (B**2)) + (d * R * G) + (e * G * B) + (f * B * R) + (g * R) + (h * G) + (i * B))
    # Prevent overflow - if larger than 255 or smaller than 0, clip to those values
    return val.clip(0, 255).astype('uint8')


def change_colors_in_one_image(image_path):
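    with Image.open(image_path) as img:
        # Convert the image to a numpy array; convert("RGB") drops any alpha
        # channel so that unpacking into r, g, b below works
        pixels = np.array(img.convert("RGB"))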
    # Put channel axis first
    pixels = np.moveaxis(pixels, -1, 0)
    # Avoid overflow in intermediate calculations
    pixels = pixels.astype('float32')
    r, g, b = pixels
    new_r = rgb_out(r, g, b, r_mat)
    new_g = rgb_out(r, g, b, g_mat)
    new_b = rgb_out(r, g, b, b_mat)
    pixels = np.stack([new_r, new_g, new_b])
    # Move channel axis back to last
    pixels = np.moveaxis(pixels, 0, -1)
    img = Image.fromarray(pixels)
    return img

Timing this, it takes 372ms per frame, which is about 100x faster.
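The same idea can also be written as a single matrix product, which some may find easier to read. This is only a sketch: `quadratic_features` and `recolor` are hypothetical helper names, and `coeffs` is assumed to be a (9, 3) array whose columns are `r_mat`, `g_mat` and `b_mat` from the question (for example `np.stack([r_mat, g_mat, b_mat], axis=1)`). No timing is claimed for this variant.

```python
import numpy as np


def quadratic_features(rgb):
    """Map an (N, 3) float array to the 9 monomials used by rgb_out."""
    R, G, B = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    # Same term order as in rgb_out: squares, cross terms, linear terms
    return np.stack([R**2, G**2, B**2, R * G, G * B, B * R, R, G, B], axis=1)


def recolor(pixels, coeffs):
    """Recolor an (H, W, 3) uint8 array with a (9, 3) coefficient matrix."""
    flat = pixels.reshape(-1, 3).astype(np.float32)
    out = quadratic_features(flat) @ coeffs
    # Clip to the valid range before casting back to uint8
    return out.clip(0, 255).astype(np.uint8).reshape(pixels.shape)
```

With real data, `recolor(np.array(img.convert("RGB")), coeffs)` would return an array that can be passed to `Image.fromarray`.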

Nick ODell