I've been working on this problem for some time now with little promising results. I am trying to split up an image into connected regions of similar color. (basically split a list of all the pixels into multiple groups (each group containing the coordinates of the pixels that belong to it and share a similar color).
For example: http://unsplash.com/photos/SoC1ex6sI4w/
In this image the dark clouds at the top would probably fall into one group. Some of the grey rock on the mountain in another, and some of the orange grass in another. The snow would be another - the red of the backpack - etc.
I'm trying to design an algorithm that will be both accurate and efficient (it needs to run in a matter of ms on midrange laptop grade hardware)
Below is what I have tried:
Using a connected component based algorithm to go through every pixel from top left scanning every line of pixels from left to right (and comparing the current pixel to the top pixel and left pixel). Using the CIEDE2000 color difference formula if the pixel at the top or left was within a certain range then it would be considered "similar" and part of the group.
This sort of worked - but the problem is it relies on color regions having sharp edges - if any color groups are connected by a soft gradient it will travel down that gradient and continue to "join" the pixels as the difference between the individual pixels being compared is small enough to be considered "similar".
To try to fix this I chose to set every visited pixel's color to the color of most "similar" adjacent pixel (either top or left). If there are no similar pixels than it retains it's original color. This somewhat fixes the issue of more blurred boundaries or soft edges because the first color of a new group will be "carried" along as the algorithm progresses and eventually the difference between that color and the current compared color will exceed the "similarity" threashold and no longer be part of that group.
Hopefully this is making sense. The problem is neither of these options are really working. On the image above what is returned are not clean groups but noisy fragmented groups that is not what I am looking for.
I'm not looking for code specifically - but more ideas as to how an algorithm could be structured to successfully combat this problem. Does anyone have ideas about this?
Thanks!