How do I create/replicate this dominant colors function?

Question

Currently I am using Python to determine the dominant colors of an image (function adapted from https://stackoverflow.com/a/3244061/7274182):

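import numpy
import scipy.cluster.vq
import sklearn.cluster


def dominant_colors(image):
    """
    Adaptation of https://stackoverflow.com/a/3244061/7274182
    """
    ar = numpy.asarray(image.resize((150, 150), 0))
    shape = ar.shape
    # numpy.product was removed in NumPy 2.0; numpy.prod is the supported spelling.
    ar = ar.reshape(numpy.prod(shape[:2]), shape[2]).astype(float)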

    kmeans = sklearn.cluster.MiniBatchKMeans(
        n_clusters=10, init="k-means++", max_iter=20, random_state=1000
    ).fit(ar)
    codes = kmeans.cluster_centers_

    vecs, _dist = scipy.cluster.vq.vq(ar, codes)  # assign codes
    counts, _bins = numpy.histogram(vecs, len(codes))  # count occurrences

    colors = []
    for index in numpy.argsort(counts)[::-1]:
        color_tuple = tuple([int(code) for code in codes[index]])
        colors.append(color_tuple)
    return colors  # returns colors in order of dominance

I wanted to port this code to Rust, so I initially tried color-thief-rs. However, its output was sometimes wrong (e.g. a clearly red image producing a grey color), and it seemingly has no nice way to adjust the k-means parameters so that I could fix it.

Are there other crates that can give me similar results to this function and allow me to change the KMeans parameters to my liking?

N.B. Even though the original version uses sklearn.cluster.MiniBatchKMeans, either mini-batch or standard k-means will suffice.

score 2 · Answer 1 · answered Sep 03 '22 at 08:19

It's been a while since I last used python, but I think the following should be close to what you are searching for:

use image::GenericImageView;
use kmeans_colors::Sort;
use kmeans_colors::{get_kmeans, Kmeans};
use palette::{FromColor, IntoColor, Lab, Pixel, Srgb};

fn main() {
    let pixels = read_image("path/to/your/image.png").unwrap();
    let output = dominant_colors(&pixels);

    // Print output for demonstration purposes.
    for c in output.iter() {
        println!("{}", c);
    }
}

/// Finds the colors in pixel data in the order of dominance.
///
/// Arguments:
///
/// - `pixels`: RGB data with 8 bits per channel.
///
/// Returns:
///
/// A list of HEX strings sorted in the order of dominance.
fn dominant_colors(pixels: &[u8]) -> Vec<String> {
    // Convert RGB [u8] buffer to Lab for k-means.
    let lab: Vec<Lab> = Srgb::from_raw_slice(&pixels)
        .iter()
        .map(|x| x.into_format().into_color())
        .collect();

    // Iterate over the runs, keep the best results.
    let mut result = Kmeans::new();
    for i in 0..3 {
        let run_result = get_kmeans(10, 20, 0.0, false, &lab, 1000 + i as u64);
        if run_result.score < result.score {
            result = run_result;
        }
    }

    // Process centroid data.
    let mut res = Lab::sort_indexed_colors(&result.centroids, &result.indices);

    // Sort indexed colors by percentage.
    res.sort_unstable_by(|a, b| {
        (b.percentage)
            .partial_cmp(&a.percentage)
            .expect("Failed to compare values while sorting.")
    });

    // Uncomment to print RGB values and percentage.
    /*for r in res.iter() {
        let c: Srgb<u8> = Srgb::from_color(r.centroid).into_format();
        println!(
            "{} {} {} : {:.2}%",
            c.red,
            c.green,
            c.blue,
            r.percentage * 100f32
        );
    }*/

    // Format colors as RGB HEX string.
    let mut hex_colors = Vec::new();
    for r in res.iter() {
        let c: Srgb<u8> = Srgb::from_color(r.centroid).into_format();
        let hex_str = format!("#{:02x}{:02x}{:02x}", c.red, c.green, c.blue);
        hex_colors.push(hex_str);
    }
    hex_colors
}

/// Reads an image from the given path to a list of color values per channel and pixel.
///
/// Arguments:
///
/// - `path`: The path to the image.
///
/// Returns:
///
/// A [Result] that holds the list of color values if successful.
fn read_image(path: &str) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
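    let img = image::open(path)?;
    // `resize_exact` returns a new image (the original call discarded it) and,
    // like the Python `image.resize((150, 150), 0)`, ignores the aspect ratio.
    let img = img.resize_exact(150, 150, image::imageops::FilterType::Nearest);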

    let pixels = img
        .pixels()
        .map(|p| [p.2 .0[0], p.2 .0[1], p.2 .0[2]])
        .flatten()
        .collect();

    Ok(pixels)
}

I used these dependencies:

[dependencies]
image = "0.24.3"
kmeans_colors = "0.5.0"
palette = "0.6.1"

It relies on kmeans_colors instead of scikit-learn to do the heavy lifting. The rest of the code is just some conversions to get an output close enough to what your Python code would deliver.

score -2 · Answer 2 · answered Sep 07 '22 at 20:20

1.- I have repeated the dominant colour(s) search on different test images with your Python script, with the Python script you cite as your reference, and with the following (slightly modified) MATLAB script, freely available from the Image Manipulation Toolbox:

https://uk.mathworks.com/matlabcentral/fileexchange/53786-image-manipulation-toolbox?s_tid=srchtitle

test script

% rgbpict = imread('A10_02.jpg');
rgbpict = imread('test_im_halifax.jpg');
% get some different image stats
cc = imstats(rgbpict,'mean','median','mode','modecolor','modefuzzy','moderange','nmost',10);
% use those stats to construct a swatch chart for single-output stats
labels = {'mean','median','mode','modecolor','modefuzzy'};
sz = imsize(rgbpict,2);
ntiles = (numel(labels));
tilesz = [round(sz(1)/ntiles) 100];
block1 = zeros([tilesz 3 ntiles],'uint8');
for k = 1:ntiles
    thistile = colorpict([tilesz 3],cc(k,:),'uint8'); % colored swatch
    thislabel = im2uint8(textim(labels{k},'ibm-iso-16x9')); % text label image
    thistile = im2uint8(imstacker({thislabel thistile},'padding',0)); % match geometry
    block1(:,:,:,k) = mergedown(thistile,1,'lineardodge'); % blend label and swatch
end
block1 = imtile(block1,[ntiles 1]); % vertically arrange tiles
block1 = imresize(block1,[sz(1) tilesz(2)]); % make sure it's the right size
 
% create another chart for moderange's multiple outputs
ntiles = (size(cc,1)-ntiles);
tilesz = [round(sz(1)/ntiles) 100];
block2 = zeros([tilesz 3 ntiles],'uint8');
for k = 1:ntiles
    thistile = colorpict([tilesz 3],cc(k+4,:),'uint8'); % colored swatch
    thislabel = im2uint8(textim(num2str(k),'ibm-iso-16x9')); % text label image
%     thistile = im2uint8(imstacker({thislabel thistile},'padding',0)); % match geometry
    block2(:,:,:,k) = mergedown(thistile,1,'lineardodge'); % blend label and swatch
end
block2 = imtile(block2,[ntiles 1]); % vertically arrange tiles
block2 = imresize(block2,[sz(1) tilesz(2)]); % make sure it's the right size

It's a free toolbox. I had to put literally everything in the same folder to get it to work, and the above test script only started working once the following line was removed:

%     thistile = im2uint8(imstacker({thislabel thistile},'padding',0)); % match geometry

This line attempts to resize the palette of quantized top colours added on the right-hand side, but it needs more work: as it stands, the script fails for certain input image sizes.

2.- using your Python script :

N   most frequent is 

1   [113.81217778 141.25035556 172.20262222] (#718dac)
2   [ 97.10606222 129.13072661 164.74840188] (#6181a4)
3   [ 98.46631408 134.08834666 173.51950237] (#6286ad)
4   [ 97.7581243  134.19380946 174.45242802] (#6186ae)
5   [ 96.56581855 133.67920955 174.60338405] (#6085ae)
6   [ 92.46806407 133.20328258 176.34605497] (#5c85b0)
7   [ 93.61373888 134.19164841 178.5884779 ] (#5d86b2)
8   [ 93.43950362 134.14128749 178.54007239] (#5d86b2)
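The hex codes in the right-hand column match the centroid values with each channel truncated to an integer, which is what `int(code)` does in the question's function. A minimal sketch of that conversion (the helper name `to_hex` is made up for illustration):

```python
def to_hex(centroid):
    """Format an (R, G, B) centroid as '#rrggbb', truncating each channel to an int."""
    r, g, b = (int(channel) for channel in centroid)
    return "#{:02x}{:02x}{:02x}".format(r, g, b)


# First row of the table above.
print(to_hex([113.81217778, 141.25035556, 172.20262222]))  # #718dac
```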

3.- Next I changed the test image to test_im_peppers.jpg, which is used in several MATLAB Image Processing examples on the MathWorks website, and fed it to the script you mentioned as your reference:

Python - Find dominant/most common color in an image

Checking again with the reference script

N   Dominant
1   [121.17777778  66.76537778  57.6372]   (#794239)
2   [80.45327447 44.12917252 59.5902067]   (#502c3b)
3   [72.66094284 42.63534462 67.05970268]   (#482a43)
4   [72.94724511 40.75997738 67.46606883]   (#482843)
5   [71.48615917 40.57579502 66.24139067]   (#472842)
6   [67.95439536 37.52449799 61.03685855]   (#43253d)
7   [67.46886282 37.9381769  61.53898917]   (#43253d)
8   [67.38328912 36.55617658 63.51752558]   (#43243f)

4.- Now feeding the test_im_peppers.jpg image to the following free online dominant colour app:

https://colorpalette.imageonline.co/#:~:text=Palette%20is%20generated%20using%20dominant%20colors%20of%20image,to%20generate%20color%20palette.%20How%20this%20tool%20working%3F

Checking again with the reference script

N   Dominant
1   #6F3735
..
8   #E3BE5B

5.- Repeating for another free dominant colour(s) search app :

https://dominant-colors.com/

N   Dominant
..

5.- and repeating with this other free dominant colour(s) app :

https://onlinejpgtools.com/find-dominant-jpg-colors

N   Dominant
1   
2   #542c3a
3   #4e2c3d
4   #4c2740
5   #462943
6   
7   
8

6.- I have all the resulting dominant colours and palettes ready to share if you tell me to

some of the test images I have used

7.- The following book published by Springer:

Dominant Color Palette Extraction by K-Means Clustering Algorithm and Reconstruction of Image

Authors: Illa Pavan Kumar, V. P. Hara Gopal, Somula Ramasubbareddy, Sravani Nalluri

https://link.springer.com/chapter/10.1007/978-981-15-1097-7_78

may contain all you need to solve this and many other questions related to dominant colours.

The Kindle version of this book is priced at a whopping £159.50 and the paperback at £199.99. This can only mean some combination of the following:

7.1. it is a really good book
7.2. the book has been hacked and is already out there for free
7.3. no one buys it

8.- I'd like to draw attention to the following part of this MathWorks forum answer

https://uk.mathworks.com/matlabcentral/answers/65055-dominant-color-for-an-rgb-image?s_tid=srchtitle_dominant%20color_1

where the Image Manipulation Toolbox is mentioned

[QUOTE]

%      'mode' returns the mode (most common values) per channel

%      'modecolor' calculates the most common color.  This differs from 'mode' as 
%        the most frequent values in individual channels are not necessarily colocated. 
%        Consider an image which is 40% [1 0 0], 30% [1 1 1] and 30% [0 1 1].  
%        For this example, 'mode' returns [1 1 1], whereas 'modecolor' returns [1 0 0].  
%        The latter would be the intuitive answer.

%      'moderange' calculates a selected range of the most common colors in the image.  
%        Contrast this with 'modecolor' which calculates the singular most common color.
%        The range of colors and number of output tuples is specified by parameter 'nmost'.  
%        This mode supports only I/RGB images.

%      'modefuzzy' calculates the frequency-weighted mean of a selected range of the
%        most common colors in the image.  The range of colors is specified by parameter 
%        'nmost'.  This mode supports only I/RGB images.

%      The 'modecolor', 'modefuzzy', and 'moderange' options all do color quantization, 
%      and can therefore alter the color population to some degree.  Be wary of using 
%      the output of these modes for anything of technical importance.

[END OF QUOTE]

From this comment, from the results shown above, and from visually assessing the palettes obtained with the peppers test image and other images, it seems the dominant colour is obtained with the modecolor option, not the mode option.

mode and modecolor are options of imstats in the MATLAB Image Manipulation Toolbox, which is free and ships with the MATLAB code for all its functions, so for more detail on how modecolor captures dominant colours just open the rel

Therefore the dominant colour(s) is/are neither the mean value, nor the median, nor mode applied to single RGB layers.

To say it again: mode returns an RGB colour built from the most common value of each RGB channel, while modecolor returns the overall most common colour, which is the dominant colour (or colours, in case of a tie).
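The difference can be reproduced with the example from the quoted documentation (40% [1 0 0], 30% [1 1 1], 30% [0 1 1]). This is a small Python sketch of the two ideas, not the toolbox's implementation; the function names are made up for illustration:

```python
from collections import Counter


def channel_mode(pixels):
    """Most common value in each channel, with channels taken independently."""
    return tuple(Counter(channel).most_common(1)[0][0] for channel in zip(*pixels))


def color_mode(pixels):
    """Most common whole colour, with all channels taken together."""
    return Counter(pixels).most_common(1)[0][0]


# A 10-pixel "image": 40% red, 30% white, 30% cyan.
pixels = [(1, 0, 0)] * 4 + [(1, 1, 1)] * 3 + [(0, 1, 1)] * 3

print(channel_mode(pixels))  # (1, 1, 1)
print(color_mode(pixels))    # (1, 0, 0)
```

The per-channel result is white, although white covers only 30% of the pixels, while the whole-colour count picks red.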

The closing comment of the quoted MathWorks forum answer warns the reader to be wary of the quantization done by modecolor.

Obviously, replacing an image with its quantized version introduces errors. That is unavoidable here, because a dominant colour search is typically done with k-means clustering, which is itself a way of quantizing images, and quantizing a signal means putting up with small errors.
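To make that trade-off concrete, here is a rough sketch that uses simple uniform binning rather than k-means (the bin width `step` and the function names are assumptions chosen for illustration). Nearby colours merge into one bin, which changes which colour wins, at the cost of reporting a colour that differs slightly from every original pixel:

```python
from collections import Counter


def quantize(pixel, step=32):
    """Snap each 0-255 channel to the centre of its `step`-wide bin."""
    return tuple((channel // step) * step + step // 2 for channel in pixel)


def dominant_after_quantizing(pixels, step=32):
    """Most common colour once every pixel has been quantized."""
    counts = Counter(quantize(p, step) for p in pixels)
    return counts.most_common(1)[0][0]


# Three slightly different reds and two identical blues.
pixels = [(200, 10, 10), (205, 12, 8), (210, 5, 15), (0, 0, 255), (0, 0, 255)]

print(Counter(pixels).most_common(1)[0][0])  # (0, 0, 255): exact counting favours blue
print(dominant_after_quantizing(pixels))     # (208, 16, 16): the reds share one bin
```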

9.- Note that I have written "dominant colour(s)" with an optional plural.

This is because an image may have more than one dominant colour, as would be the case for a chess board with exactly the same number of white and black pixels.
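A tie like this is easy to detect by returning every colour that shares the highest count. The following is only a sketch on an idealised 8x8 board; `top_colors` is a hypothetical helper name:

```python
from collections import Counter


def top_colors(pixels):
    """Return every colour tied for the highest pixel count, in sorted order."""
    counts = Counter(pixels)
    best = max(counts.values())
    return sorted(color for color, n in counts.items() if n == best)


# Idealised 8x8 chess board: 32 white and 32 black squares, one pixel each.
board = [
    (255, 255, 255) if (row + col) % 2 == 0 else (0, 0, 0)
    for row in range(8)
    for col in range(8)
]

print(top_colors(board))  # [(0, 0, 0), (255, 255, 255)]
```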

In summary, we have seen that

• the Dominant Colour is the majority colour after the image has been quantized. It may therefore not be unique: it depends on the quantizing method and on how many clusters are sought.

• k-means clustering is one popular way of quantizing images.

• The MATLAB Image Manipulation Toolbox option modecolor works on whole colours, not on individual RGB layers as the mode option does. The two often return disparate results.

• There may be more than one top dominant colour, for example in a checkered flag or a chess board with exactly the same number of black and white pixels.

• When looking for the Dominant Colour, a palette with multiple colours is usually useful for finding the dominant colours below the top one(s). The palette colours may also be referred to as dominant colours, but strictly the term should be applied to the top one only.
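For reference, the k-means route can be sketched in a few lines of plain Python with the parameters exposed. This is ordinary Lloyd's k-means with random initialisation, not scikit-learn's MiniBatchKMeans or k-means++, so its results will not match the question's function exactly. The names `kmeans_dominant`, `sq_dist` and `nearest` are made up, and `n_clusters` must not exceed the number of distinct colours:

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two colour tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(pixel, centroids):
    """Index of the centroid closest to `pixel`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(pixel, centroids[i]))


def kmeans_dominant(pixels, n_clusters=10, max_iter=20, seed=1000):
    """Plain Lloyd's k-means; returns (centroid, count) pairs, largest cluster first."""
    rng = random.Random(seed)
    # Start from randomly chosen distinct colours (an assumption; not k-means++).
    centroids = rng.sample(sorted(set(pixels)), n_clusters)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in pixels:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        updated = [
            tuple(sum(ch) / len(members) for ch in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
    # Count how many pixels end up nearest to each final centroid.
    counts = [0] * len(centroids)
    for p in pixels:
        counts[nearest(p, centroids)] += 1
    return sorted(zip(centroids, counts), key=lambda pair: pair[1], reverse=True)


pixels = [(255, 0, 0)] * 6 + [(0, 0, 255)] * 3 + [(0, 255, 0)]
for centroid, count in kmeans_dominant(pixels, n_clusters=3):
    print(tuple(int(c) for c in centroid), count)
# (255, 0, 0) 6
# (0, 0, 255) 3
# (0, 255, 0) 1
```

Sorting by cluster size is what turns the quantized palette into an order of dominance, the same role `numpy.argsort(counts)[::-1]` plays in the question's function.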

If you find this answer useful, would you please consider marking it as the accepted answer?

Thanks for reading my answer
