Image Classification - Detecting whether an image is cartoon-like

Question

I have a large number of JPEG thumbnail images ranging in size from 120x90 to 320x240, and I would like to classify them as either real-life-like or cartoon-like.

How might one do this using ImageMagick's utilities: convert, compare, identify? Or are there other programs out there that will do the trick?

If you are searching for a premade application, this belongs on Super User. — John Gietzen, Oct 05 '09 at 04:54
I'm more interested in learning how an image can be classified as cartoon-like. That being said, if an application for this already exists, I would also like to know. — kingb, Oct 05 '09 at 04:57

score 16 · Answer 1 · edited Jan 16 '10 at 16:17

I guess your best bet is the ratio between the number of distinct colors (taken from the histogram) and the number of pixels. A cartoon-like image tends to have fewer colors than a real-life one.

You can use

COLORS=`convert picture.jpg  -format %c histogram:info:- | wc -l`

to count how many colors the picture has. Then use commands like:

WIDTH=`jpeginfo picture.jpg | sed -r "s/.* ([0-9]+) x.*/\1/"`

and

HEIGHT=`jpeginfo picture.jpg | sed -r 's/.*x ([0-9]+)  .*/\1/'`

to extract width and height.

Then use this command to find the ratio:

echo $WIDTH $HEIGHT $COLORS | awk '{ print $3/($1 * $2);}'

Then it is up to you to define which ratio qualifies as cartoon-like and which does not. For cartoon-like images, the ratio is usually lower than for real-life ones.

Just a thought.

EDIT: I just saw your comment that you don't want to know how, just an existing one. So just ignore my answer then.

EDIT 2: I modified it a bit to make the difference easier to see.

NOTE 1: I swapped the ratio to pixels/colors. The number of pixels is always much bigger than the number of colors, so the previous version produced very small numbers that were hard to tell apart.

NOTE 2: I also changed from "jpeginfo" to "identify", since jpeginfo can only handle JPEG files and is not part of ImageMagick.

~/test/CheckCartoon.sh

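#!/bin/sh

IMAGE=$1
# Number of distinct colors: one histogram line per color.
COLORS=`convert "$IMAGE" -format %c histogram:info:- | wc -l`
# Width and height, parsed from the "WxH" field of identify's output.
WIDTH=`identify "$IMAGE" | sed -r "s/.* ([0-9]+)x[0-9]+ .*/\1/"`
HEIGHT=`identify "$IMAGE" | sed -r 's/.* [0-9]+x([0-9]+) .*/\1/'`
# Pixels per distinct color.
RATIO=`echo $WIDTH $HEIGHT $COLORS | awk '{ print ($1 * $2)/$3;}'`
# Zero-pad so that the output sorts correctly as text.
echo $RATIO | awk '{ printf "%020.5f",$1 }'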

~/test/CheckAll.sh

#!/bin/sh

cd images
FILES=`ls`
for FILE in $FILES; do
    # Skip files that ImageMagick has no decoder for.
    IsIMAGE=`identify $FILE 2>&1 | grep " no decode delegate " | grep -o "no"`
    if [ "$IsIMAGE" = "no" ]; then continue; fi

    # Skip files with a broken image header.
    IsIMAGE=`identify $FILE 2>&1 | grep " Improper image header " | grep -o "Improper"`
    if [ "$IsIMAGE" = "Improper" ]; then continue; fi

    echo `../CheckCartoon.sh $FILE` $FILE
done

cd ..

Now, for testing, copy the files here:

Pic 1: ~/test/images/Cartoon-01.jpg

Pic 2: ~/test/images/Cartoon-02.png

Pic 3: ~/test/images/Cartoon-03.gif

Pic 4: ~/test/images/Real-01.jpg

Pic 5: ~/test/images/Real-02.jpg

Pic 6: ~/test/images/Real-03.jpg

http://dl.getdropbox.com/u/1961549/StackOverflow/SO1518347/Images.png

Then I ran ./CheckAll.sh | sort (in the test folder). Here is what I got.

00000000000003.31362 Real-03.jpg
00000000000004.61574 Real-02.jpg
00000000000009.89920 Cartoon-01.jpg
00000000000013.05870 Real-01.jpg
00000000000020.55470 Cartoon-03.gif
00000000000032.21900 Cartoon-02.png

As you can see, the result is generally good. You can use a number like 15 as the threshold separating the two classes.
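A cutoff like this could be applied to each ratio with a small helper. This is only a sketch: the function name classify_ratio and the choice of "at or above the cutoff means cartoon" are assumptions for illustration, and the default of 15 is just the number suggested above, not a tuned value.

```shell
#!/bin/sh
# classify_ratio RATIO [THRESHOLD]
# Hypothetical helper: prints "cartoon" when the pixels-per-color ratio
# is at or above the threshold (default 15), otherwise "real".
classify_ratio() {
    awk -v r="$1" -v t="${2:-15}" \
        'BEGIN { print ((r + 0 >= t + 0) ? "cartoon" : "real") }'
}

# Example use (run from the test folder):
#   ./CheckAll.sh | while read RATIO FILE; do
#       echo "$FILE: `classify_ratio $RATIO`"
#   done
```

The zero-padded numbers printed by CheckCartoon.sh can be passed in directly, since awk converts them to ordinary numbers before comparing.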

Cartoon-01.jpg is a drawing, but quite a realistic one, so it is easily confused. Also, Real-01.jpg is a picture of my girlfriend standing in front of an ocean, so the number of colors is lower than usual. The confusion therefore comes as no surprise.

What I show here is still a rough theory. If you really want a conclusive indication, you may have to compute a number of metrics and combine them, for example the degree of local contrast.

Hope this helps.

I ran your solution against a sample set of 200 pictures for each set (cartoon, real) and there is no clear distinction between the classifications. — kingb, Oct 05 '09 at 05:48
I ran this solution again with your modifications, but it is still the same. I believe the sample sizes you're using, compared to what I'm using (120x90 - 320x240), are the reason there's little distinction between the two. — kingb, Oct 05 '09 at 10:29
I've just seen that your picture resolution is quite small. Because this method relies on collective information (the histogram), it is not suitable for small pictures, as the number of colors does not differ much from picture to picture. In this case, I really don't know how to help you. Sorry. — NawaMan, Oct 05 '09 at 10:45

score 12 · Accepted Answer · edited Apr 28 '19 at 13:05

In theory:

One way to discriminate between cartoon and natural scene images is to compare a given image to its "smoothed" self. The motivation behind this is that a "smoothed" cartoon image statistically will not change much, whereas a natural scene image will. In other words, take an image, cartoonify (i.e. smooth) it and subtract the result from the original:

isNotACartoonIndex = mean( originalImage - smooth(originalImage) )

The mean of this difference gives the level of change caused by the smoothing. The index should be high for non-smooth (natural scene) originals and low for smooth (cartoony) originals.
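To make the index concrete, here is a rough shell sketch of the same idea. Several things in it are assumptions made for illustration and are not part of the answer: the function name not_cartoon_index, the use of a plain 3x3 box mean as a stand-in for a real smoothing filter, working on grayscale values in a plain-text (P2) PGM, and taking the absolute difference so that positive and negative changes do not cancel out.

```shell
#!/bin/sh
# not_cartoon_index
# Reads a plain-text (P2) PGM image on stdin and prints the mean of
# |pixel - smoothed pixel|, scaled to the range 0..1.
# The smoothing is a 3x3 box mean, a simple stand-in for a real filter.
not_cartoon_index() {
    awk '
    /^#/ { next }                                  # skip PGM comment lines
    { for (i = 1; i <= NF; i++) tok[++n] = $i }    # collect all tokens
    END {
        # Tokens: magic, width, height, maxval, then pixels row by row.
        w = tok[2]; h = tok[3]; maxval = tok[4]
        for (y = 0; y < h; y++)
            for (x = 0; x < w; x++)
                px[y, x] = tok[5 + y * w + x]
        for (y = 0; y < h; y++)
            for (x = 0; x < w; x++) {
                # Mean of the 3x3 neighbourhood, clipped at the borders.
                s = 0; c = 0
                for (dy = -1; dy <= 1; dy++)
                    for (dx = -1; dx <= 1; dx++) {
                        yy = y + dy; xx = x + dx
                        if (yy >= 0 && yy < h && xx >= 0 && xx < w) {
                            s += px[yy, xx]; c++
                        }
                    }
                d = px[y, x] - s / c
                total += (d < 0 ? -d : d)
            }
        printf "%.5f\n", total / (w * h * maxval)
    }'
}

# Example use, assuming ImageMagick writes a plain PGM with -compress none:
#   convert picture.jpg -colorspace Gray -depth 8 -compress none pgm:- | not_cartoon_index
```

A perfectly flat image gives 0, and the value grows with the amount of fine detail that the smoothing removes. Where the cutoff between cartoon and natural images lies would have to be found on your own sample set.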

An SO question already discusses how to cartoonify images.

In practice:

I would suggest doing the smoothing/cartoonifying with bilateral filtering:

Bilateral filtering can be done with OpenCV using the cvSmooth function with the CV_BILATERAL parameter.

As for subtracting the cartoonified image from the original, I would do that with the Hue channel of the HSV images. This means you first need to convert both images from RGB to HSV.

As a side note, trying to achieve this with an ImageMagick workflow might be unnecessarily complicated.

score 5 · Answer 3 · answered Dec 08 '09 at 04:58

As a first pass I would try computing the entropy of the color histogram of the image. Cartoon-like images should have fewer shades of different colors, and thus a lower entropy.

This is similar to what NawaMan proposed, but it goes one step further. The number of colors over the number of pixels may not be enough. There may be JPEG artifacts, for instance, that artificially increase the number of colors in the image, but only for a few pixels. In this case most pixels in the image would still share very few colors, which would correspond to low entropy.

Let's say you start with an RGB image. For each pixel the R, G, and B values range from 0 to 255. You can divide this range into n bins per channel, where n can be 16 for example. Then you would count how many pixels fall into each one of these 3-dimensional bins. Next, divide the bin counts by the total number of pixels, so that your histogram sums to 1. Finally, compute the entropy, which is - sum_i p_i * log(p_i), where p_i is the value of the ith bin.

Try it with different values for n, and see if you can separate the real images from cartoons.
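The binning and entropy steps described above can be sketched in shell on top of ImageMagick's histogram output. The function name hist_entropy is made up for this example, and the parsing assumes 8-bit channels and histogram lines of the form "COUNT: (R,G,B) ..."; check what your ImageMagick version actually prints before relying on it.

```shell
#!/bin/sh
# hist_entropy [N]
# Reads ImageMagick histogram lines on stdin, groups each 8-bit channel
# into N bins (default 16), and prints the entropy -sum_i p_i * log(p_i)
# of the binned 3-dimensional histogram, using the natural logarithm.
hist_entropy() {
    awk -v n="${1:-16}" '
    BEGIN { w = 256 / n }                  # width of one bin per channel
    /\(/ {
        count = $1; sub(/:$/, "", count)   # first field is "COUNT:"
        rgb = $0
        sub(/^[^(]*\(/, "", rgb)           # drop everything up to "("
        sub(/\).*$/, "", rgb)              # drop everything from ")"
        split(rgb, c, ",")
        key = int(c[1] / w) "," int(c[2] / w) "," int(c[3] / w)
        bins[key] += count                 # pixels per 3-D bin
        total += count
    }
    END {
        if (total == 0) exit 1             # no histogram lines were read
        for (k in bins) { p = bins[k] / total; h -= p * log(p) }
        printf "%.5f\n", h
    }'
}

# Example use:
#   convert picture.jpg -depth 8 -format %c histogram:info:- | hist_entropy 16
```

Because near-identical colors land in the same bin, a handful of stray JPEG-artifact pixels barely moves the result, which is the point of using entropy instead of the raw color count.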

score 0 · Answer 4 · answered Oct 05 '09 at 04:57

This is an Image-classification problem which AFAIK ImageMagick will NOT be able to do.

opencv (which deals with computer vision) might be of more help, for some idea on how an "image classifier" is trained with training data.

Alphaneo
