Compare Images in Python

Question

I need to compare two images that are screenshots of a software application. I want to check whether the two images are identical, including the numbers and letters displayed in them. How can this be accomplished?

What do you mean when you say compare? Do you want to see if they are identical? Are you looking for details on how to do this in Python or how to compare images in general? — Guy Sirton, Aug 05 '12 at 20:58
yes i wanted to compare to see if they are identical including the numbers/letters displayed in the software — stallion, Aug 06 '12 at 04:53

score 13 · Accepted Answer · answered Aug 05 '12 at 17:04

There are two common ways to do this comparison.

First is the Root-Mean-Square Difference

To get a measure of how similar two images are, you can calculate the root-mean-square (RMS) value of the difference between the images. If the images are exactly identical, this value is zero. The following function uses ImageChops.difference to build the difference image, and then calculates the RMS value from the histogram of that image.

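# Example: File: imagediff.py

from functools import reduce  # reduce is no longer a builtin in Python 3
from PIL import ImageChops    # Pillow exposes ImageChops under the PIL package
import math, operator

def rmsdiff(im1, im2):
    "Calculate the root-mean-square difference between two images"

    # histogram of the absolute difference; multi-band images (e.g. RGB)
    # return their bands concatenated, 256 bins per band
    h = ImageChops.difference(im1, im2).histogram()

    # calculate rms; i % 256 maps each bin index back to its pixel value
    return math.sqrt(reduce(operator.add,
        map(lambda count, i: count * ((i % 256) ** 2), h, range(len(h)))
    ) / (float(im1.size[0]) * im1.size[1]))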
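As a cross-check, the same idea can be sketched with NumPy instead of the histogram. This is only an illustration, not part of the original answer: the name rmsdiff_np and the tiny test images are made up, and it assumes both images have the same size and mode. Note that it averages over every channel value, whereas the function above divides by the pixel count only, so the two can differ by a constant factor on multi-band images.

```python
import numpy as np
from PIL import Image, ImageChops

def rmsdiff_np(im1, im2):
    """RMS of the per-value differences between two same-size, same-mode images."""
    # ImageChops.difference gives the absolute difference per channel value
    diff = np.asarray(ImageChops.difference(im1, im2), dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

# Two tiny 2x2 greyscale images, invented for this example
a = Image.new("L", (2, 2), 10)
b = Image.new("L", (2, 2), 13)

print(rmsdiff_np(a, a))  # identical images give 0.0
print(rmsdiff_np(a, b))  # every pixel differs by 3, so the RMS is 3.0
```

A result of zero means the pixel data match exactly; anything above zero grows with the size of the differences.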

Another is Exact Comparison

The quickest way to determine if two images have exactly the same contents is to get the difference between the two images, and then calculate the bounding box of the non-zero regions in this image. If the images are identical, all pixels in the difference image are zero, and the bounding box function returns None.

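from PIL import ImageChops  # Pillow exposes ImageChops under the PIL package

def equal(im1, im2):
    # getbbox() returns None when the difference image has no non-zero region
    return ImageChops.difference(im1, im2).getbbox() is None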
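A commenter below reports getbbox() returning None for two visibly different images. One possible cause (an assumption, not verified against that user's files) is an alpha channel: for images with alpha, getbbox() may look only at the alpha band, and the difference of two fully opaque images has alpha 0 everywhere. A minimal sketch that sidesteps this by comparing the decoded bytes directly is shown here. The name images_equal is hypothetical, and the approach is intended for common 8-bit modes.

```python
from PIL import Image

def images_equal(im1, im2):
    """True only if both images have the same size and identical RGBA pixels."""
    if im1.size != im2.size:
        return False
    # Converting to RGBA puts palette, greyscale and RGB images on a common
    # footing and keeps the alpha channel in the comparison.
    return im1.convert("RGBA").tobytes() == im2.convert("RGBA").tobytes()

# Two opaque RGBA images that differ in a single pixel (made-up test data)
a = Image.new("RGBA", (4, 4), (255, 0, 0, 255))
b = a.copy()
b.putpixel((1, 1), (0, 255, 0, 255))

print(images_equal(a, a.copy()))  # True
print(images_equal(a, b))         # False
```

The explicit size check matters because a size mismatch should count as "not equal" rather than being silently cropped or raising an error.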

answered Aug 05 '12 at 17:04

NIlesh Sharma


A "magic number" caused me an issue. I suggest replacing `range(256)` with `range(len(h))` for robustness. – shermy Apr 07 '18 at 03:58
Small quibble I should mention: the big one-liner (return statement) might be nicer broken out. But if not, I'd personally at least give the lambda param h another name, just for clarity: eg: `map(lambda x, i: x*(i**2), h, range(len(h)))`. Cheers! ;) – shermy Apr 07 '18 at 04:02
Okay. For RGB palette I got 768 values (256 consecutive for R,G,B). So first, the modulus of 256 is needed to make each channel "click over" to zero in our histogram, else, the weighted frequency of the second colour channel, for example, will be 256 times out! Each successive channel would be +256 further out! Also, I noted the need to transpose h and I. This worked: `return math.sqrt(reduce(operator.add, map(lambda h, i: i%256*(h**2), h, range(len(h))) ) / (float(im1.size[0]) * im1.size[1]))` – shermy Apr 08 '18 at 03:34
Trying your exact comparison example, I'm getting odd results. `diff.getbbox() is None` is `True` even though `im2` is just a copy of `im1` that I opened in paint and drew all over. Any thoughts? – Brian Sterling Feb 02 '22 at 17:40

jterrace · Answer 2 · 2013-10-13T18:08:58.017

10

I'm maintaining a Python library called pyssim that uses the Structural Similarity (SSIM) method to compare two images.

It doesn't have Python bindings, but the perceptualdiff program is also awesome at comparing two images, and quite fast.

edited Oct 13 '13 at 18:08

answered Aug 05 '12 at 17:19

jterrace


is the pyssim a sort of a fuzzy comparison? what if there was a boxes of text in an image, the positions of boxes are same with another image, but the text is slightly different? What score would it return? does it only consider structures? – user299709 Aug 03 '14 at 21:51
If the text is different, it will have a lower score. – jterrace Aug 04 '14 at 00:57
Am I correct to say the 2 images compared needs to be of the same dimension? – Lester Cheung Sep 30 '18 at 14:29
Yes, you'd have to resize one first so they match. – jterrace Oct 01 '18 at 16:07

Fredaroo · Answer 3 · 2021-12-02T18:02:12.123

For anyone who stumbles upon this and for whom the accepted answer did not work, I'm posting this here.

I had a similar scenario where I needed to compare one image with thousands of others and find the one that resembled it most closely. I ended up starting with the ImageChops difference function and taking the mean of the result, like so:

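import numpy as np
from PIL import ImageChops

def calcdiff(im1, im2):
    # absolute per-channel difference, averaged over all pixels and channels
    dif = ImageChops.difference(im1, im2)
    return np.mean(np.array(dif))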

By turning the difference image into an array I'm able to calculate the mean difference. The lower the mean, the closer the compared image is to the original.

Note: Another approach that worked for near-complete resemblance is to convert ImageChops.difference(im1, im2) to a numpy array and then remove the exact-match pixels, i.e. those equal to [0, 0, 0]. The len() of what remains gives a score that lets us differentiate between the images; the closest image is the one with the smallest score.
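One way to read the note above is "count the pixels that are not an exact match". A minimal sketch under that reading follows. The function name count_changed_pixels and the test images are made up, and converting both inputs to RGB is an assumption made so that every pixel has exactly three channels.

```python
import numpy as np
from PIL import Image, ImageChops

def count_changed_pixels(im1, im2):
    """Number of pixels that differ in at least one channel."""
    diff = np.asarray(ImageChops.difference(im1.convert("RGB"), im2.convert("RGB")))
    # diff has shape (height, width, 3); a pixel is unchanged when it is [0, 0, 0]
    changed = diff.any(axis=-1)
    return int(np.count_nonzero(changed))

# 3x3 black image and a copy with two pixels altered (made-up test data)
a = Image.new("RGB", (3, 3), (0, 0, 0))
b = a.copy()
b.putpixel((0, 0), (1, 0, 0))
b.putpixel((2, 2), (0, 0, 9))

print(count_changed_pixels(a, a))  # 0
print(count_changed_pixels(a, b))  # 2
```

Unlike the mean, this score ignores how large each difference is, so a faint change and a drastic change to the same pixel count equally.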

score 1 · Answer 4 · answered Aug 05 '12 at 15:06

I can't give a ready-to-use answer, but I can point you in (I think) the right direction. A simple way of comparing two images is to hash their binary representations and check whether the hashes match. There are a few caveats. First, you need a hash function with a low chance of collisions. Second, an image file usually carries metadata alongside the pixel data, so you will have to strip that metadata in order to compare the images on their pixel content alone. Finally, I don't know for sure, but the binary representation of an image encoded as JPEG is probably different from that of the same image encoded as PNG, so you should be aware of that.
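A rough sketch of the hashing idea, assuming Pillow is used to decode the files so that metadata and container format drop out of the comparison. The names pixel_hash and roundtrip are hypothetical, SHA-256 is just one reasonable choice of hash, and the in-memory round trip stands in for opening real files.

```python
import hashlib
import io
from PIL import Image

def pixel_hash(im):
    """SHA-256 over the size plus the decoded RGBA pixel data (file metadata is ignored)."""
    rgba = im.convert("RGBA")
    digest = hashlib.sha256()
    # Include the size so that, e.g., a 2x8 and a 4x4 image with the same bytes differ
    digest.update(repr(rgba.size).encode("ascii"))
    digest.update(rgba.tobytes())
    return digest.hexdigest()

def roundtrip(im, fmt):
    """Encode an image to the given format in memory and decode it again."""
    buf = io.BytesIO()
    im.save(buf, format=fmt)
    buf.seek(0)
    return Image.open(buf)

original = Image.new("RGB", (4, 4), (10, 20, 30))
as_png = roundtrip(original, "PNG")
as_bmp = roundtrip(original, "BMP")

# Both formats are lossless, so the decoded pixels (and the hashes) should match
print(pixel_hash(as_png) == pixel_hash(as_bmp))
```

Hashing the raw file bytes instead would treat the PNG and BMP copies as different. A lossy format such as JPEG generally changes the pixels themselves, so even this pixel-level hash would not match after a JPEG round trip.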
