0

I need to copy images from 'Asset' folder in Windows 10 which has background images automatically downloaded. Some of these images will never be displayed and at some point deleted. To make sure I have seen all the new images before they are deleted I have created a Python script that copy these images into a different folder. To efficient I need a way to compare two images those that only the new ones are copied. All I need to do is to have a function that takes two images compare them with a simple approach to be sure that the two images are not visually identical. A simple test would be to take an image file copy it and compare the copy and the original, in which case the function should be able to tell that those are the same images. How can I compare two images in python? I need simple and efficient way to do it. Several answers I have read are a bit complicated.

Amani
  • 16,245
  • 29
  • 103
  • 153
  • 1
    what do you mean by "comparing "? exactly pixel value or similar images with different quality ? – minhhn2910 Jul 29 '17 at 07:29
  • And the compare function only needs to return "identical" or "different" ? – minhhn2910 Jul 29 '17 at 07:30
  • I suppose that the complicated answers you've seen are comparing the actual images. You don't really need to do that. You just need to test if two files contain the exact same bytes. And the easy way to do that is to compare hashes of those files. You could use (for example) MD5 or SHA-256 from the standard `hashlib` module. – PM 2Ring Jul 29 '17 at 07:31
  • Is it important that they are images? If you had a solution could it work equally well whatever the files are? – Peter Wood Jul 29 '17 at 07:32
  • Take a look at the answers here: [Finding duplicate files via hashlib?](https://stackoverflow.com/questions/18724376/finding-duplicate-files-via-hashlib) – PM 2Ring Jul 29 '17 at 07:35
  • @PeterWood, yes they must be images. – Amani Jul 29 '17 at 07:38

2 Answers2

5

I encountered a similar problem before. I used PIL.Image.tobytes() to convert the image to a byte object, then call hash() on the byte object and compared the hash values.

user2341726
  • 294
  • 1
  • 9
  • From your experience, have you encountered a situation where this did not work? – Amani Jul 29 '17 at 08:47
  • No, I am just using that captures a few screenshots and make sure I don't make duplicate copies. Unless you are processing billions of images, I won't worry about collisions. – user2341726 Jul 29 '17 at 08:52
  • @Amani I strongly recommend that you use a cryptographic hash function from `hashlib` for this, rather than the built-in `hash`, to reduce the chance of hash collisions. And there's really no need to use `PIL.Image.tobytes`, unless you've converted the image files to different image file formats. Just uses the hashes of the bytes in the files. – PM 2Ring Jul 29 '17 at 09:06
3

Compare two images in python

Option 1: Use ImageChops module and it contains a number of arithmetical image operations, called channel operations (“chops”). These can be used for various purposes, including special effects, image compositions, algorithmic painting, and more.

Example:

ImageChops.difference(image1, image2) ⇒ image

Returns the absolute value of the difference between the two images.
out = abs(image1 - image2)

Option 2:

Scikit-image is an image processing toolbox for SciPy.

In scikit-image, please use the compare_ssim to Compute the mean structural similarity index between two images.

References:

Python Compare Two Images

Community
  • 1
  • 1
jose praveen
  • 1,298
  • 2
  • 10
  • 17