
Is there a piece of code that would help me understand how to compare two image files and check whether they are the same?

Thanks in advance.

EDIT: For example, I could check the CRC32 (which I do not know how to compute) together with the file size. If both match, that would mean the pictures are identical...

EDIT2: When I say the images are the same, I mean they look exactly the same to the user.

javatar
  • If you have lossy compression, you need to define what you mean by "same". Two images can be effectively identical to the human eye, but one file can be half the size of the other. – Peter Lawrey Oct 01 '12 at 12:23
  • Yep, a CRC of an image file will only tell you if the file is probably identical to another copy of the same file. It will not tell you if the image inside is probably the same, where the files were separately created. – Hot Licks Oct 01 '12 at 12:26
  • If you are using a CRC32 check, you are checking that the files are byte for byte exactly the same. This means two pictures that contain the same information and look the same, but are encoded differently, will not be the "same". – Peter Lawrey Oct 01 '12 at 12:30
  • @HotLicks If a file is identical to another copy of the same file, then logically, to my mind, the images inside them are the same as well. Am I wrong, and if so, how? Waiting for your replies, thanks guys. – javatar Oct 01 '12 at 12:30
  • @Peter Can you give a more detailed explanation? Do you mean that CRC32 would report two images as different even if they contain the same information and look the same? Thanks – javatar Oct 01 '12 at 12:34
  • CRC32 will give you a different number even if the metadata attached to the image is in a different order (something you can't even see). – Peter Lawrey Oct 01 '12 at 12:39

2 Answers


You can use CRC32 to checksum any file. However, if you want to find out whether two images are the same, you first have to decide whether two images that look the same count as the same. For example, the following images all have different file sizes, let alone different CRC32 values.

[four example images]

Peter Lawrey
  • Thanks Peter, I should have given that detail at first. In a nutshell, any pictures that look the same to the end user are the same for me! So the images you provided are the same for me :) Hence, what I understand is that CRC32 is not adequate? What would you suggest for my case? Thanks. – javatar Oct 01 '12 at 12:46
  • You need to convert them to a common format (usually an uncompressed bitmap) and compare the pixels to see if they are similar, e.g. take the distance between the colours, `sqrt(sum((x1 - x2)^2))`. When this number is 0, they really are the same. If the number is very small (and you have to decide what that means), they are the same; when the number is large, they are not. There will be a grey area where it is hard to classify. You may need to transform the image as well, such as scaling and cropping, e.g. if the image has one extra blank line it is the same even though it's not the same size. ;) – Peter Lawrey Oct 01 '12 at 13:02
  • To compare images to see if they "look" the same you must use relatively sophisticated signal processing techniques. http://dsp.stackexchange.com/ would be where you'd ask further questions. – Hot Licks Oct 01 '12 at 15:38
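
The pixel-distance idea from the comments above could be sketched roughly as follows. This is only an illustration, not a definitive method: the class and method names (`PixelDistance`, `rmsDistance`, `looksSame`) are made up for this example, it assumes both images are already decoded into `BufferedImage` objects of equal dimensions (for instance via `ImageIO.read`), and it divides by the number of colour samples (a root-mean-square variant of the distance) so that the threshold does not depend on image size.

```java
import java.awt.image.BufferedImage;

class PixelDistance {

    /**
     * Root-mean-square difference over the R, G and B channels.
     * 0.0 means every pixel is identical; larger values mean "less alike".
     */
    static double rmsDistance(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            // Scaling or cropping to a common size is left out of this sketch.
            throw new IllegalArgumentException("images must have the same dimensions");
        }
        long sumOfSquares = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                int dr = ((p >> 16) & 0xFF) - ((q >> 16) & 0xFF);
                int dg = ((p >> 8) & 0xFF) - ((q >> 8) & 0xFF);
                int db = (p & 0xFF) - (q & 0xFF);
                sumOfSquares += dr * dr + dg * dg + db * db;
            }
        }
        long samples = 3L * a.getWidth() * a.getHeight();
        return Math.sqrt((double) sumOfSquares / samples);
    }

    /** The threshold is a judgment call that the caller has to make. */
    static boolean looksSame(BufferedImage a, BufferedImage b, double threshold) {
        return rmsDistance(a, b) <= threshold;
    }
}
```

A call such as `PixelDistance.looksSame(img1, img2, 2.0)` would then treat small differences (for example, compression artifacts) as "the same"; the value `2.0` is an arbitrary placeholder that has to be tuned for the images at hand.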

The checksum stored for a zip entry tells you this much: when the checksums differ, the files are different.

The CRC32 class lets you calculate the checksum of arbitrary bytes yourself.
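
As a minimal sketch of that, assuming the class meant here is Java's `java.util.zip.CRC32`, the following computes a file's checksum and combines it with the size check mentioned in the question. The class and method names (`FileChecksum`, `crc32Of`, `probablySameBytes`) are invented for this example. Note that this compares the raw file bytes only, not what the decoded images look like.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

class FileChecksum {

    /** Returns the CRC32 checksum of the file's raw bytes. */
    static long crc32Of(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }

    /**
     * True when size and CRC32 both match. Different checksums prove the
     * files differ; equal checksums only make identical bytes very likely.
     */
    static boolean probablySameBytes(Path a, Path b) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false; // cheap check first, avoids reading the files
        }
        return crc32Of(a) == crc32Of(b);
    }
}
```

Two files that pass this check are almost certainly byte-for-byte copies; two files that fail it may still show the same picture.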

To efficiently check whether two images are almost equal, there are many approaches, such as scaling each image down to a small 8x8 image and comparing the differences in color values.
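
One possible way to sketch the 8x8 approach in Java is shown below. This is an illustration under assumptions, not the only way to do it: the names `ThumbnailCompare`, `shrink` and `thumbnailDifference` are hypothetical, bilinear scaling via `Graphics2D` is just one choice of downscaling, and the score is the mean absolute difference per colour channel (0 to 255), for which a suitable cutoff has to be chosen by experiment.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

class ThumbnailCompare {
    static final int SIZE = 8; // thumbnail is SIZE x SIZE pixels

    /** Scales any image down (or up) to an 8x8 RGB thumbnail. */
    static BufferedImage shrink(BufferedImage src) {
        BufferedImage small = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = small.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, SIZE, SIZE, null);
        g.dispose();
        return small;
    }

    /** Mean absolute per-channel difference between the two thumbnails. */
    static double thumbnailDifference(BufferedImage a, BufferedImage b) {
        BufferedImage sa = shrink(a);
        BufferedImage sb = shrink(b);
        long total = 0;
        for (int y = 0; y < SIZE; y++) {
            for (int x = 0; x < SIZE; x++) {
                int p = sa.getRGB(x, y);
                int q = sb.getRGB(x, y);
                total += Math.abs(((p >> 16) & 0xFF) - ((q >> 16) & 0xFF));
                total += Math.abs(((p >> 8) & 0xFF) - ((q >> 8) & 0xFF));
                total += Math.abs((p & 0xFF) - (q & 0xFF));
            }
        }
        return total / (3.0 * SIZE * SIZE);
    }
}
```

Because both inputs are reduced to the same tiny size first, this works even when the original images have different dimensions, at the cost of ignoring fine detail.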

Joop Eggen