I am taking screenshots of an application, and trying to detect if the exact image has been seen before. I am looking to detect trivial changes as different - e.g. if there is text in the image, and the spelling changes, that counts as a mismatch.
I've been successfully using an MD5 hash of the contents of an screen-shot image to lookup in a database of known images, and detect if it has been seen before.
Now, I have ported it to another machine, and despite my attempts to exactly match configurations, I am getting ever-so-slightly different images to the older machine. When I say different, the changes are minute - if I blow up the old and new images and flick between then, I can't see a single difference! Nonetheless, ImageMagick's compare
command can see a smattering of pixels that are different.
So my MD5 hashes are no longer matching. Rather than a simple MD5 hash, I need an image hash.
Doing my research, I find that most of the image hashes try to be fairly generous - they accept resized, transformed and watermarked images, with a corresponding false positive matches. I want an image hash that is far more strict - the only changes permitted are minute changes in colour.
Can anyone recommend an image hash library or algorithm? (Not an application, like dupdetector).
Remember: My requirements are different from the many similar questions in that I don't want a liberal algorithm like shrinking or pHash, and I don't want a comparison tool like structural similarity or ImageMagick's compare.
I want a hash that makes very similar images give the same hash value. Is that even possible?