
I have a project that contains lots of different images. Every so often we add more images to it, but first we need to check whether each one already exists (because we added it previously).

Until now we have been doing this manually, looking for the image in the folders, but as the project has grown, this has become pretty time-consuming.

So, I would like to create a script that, given an image, looks in a directory to check whether that image already exists there.

Do you know of any command-line tool, or anything else I could use to build a script that does this?

Frion3L
  • Your question is ambiguous. Do you just want to check for file names? Byte-for-byte comparisons? Some kind of computer vision comparison to check similarity? – Palpatim Oct 14 '15 at 14:13
  • Just visual comparison. These images will be used in some views. We don't want to add assets that we already have. – Frion3L Oct 14 '15 at 14:33
  • Nothing like that exists as a standard shell utility, because it's a known Hard Problem. Images are noisy, hard to predict, and rely on patterns that make perfect sense to our squishy, evolved-over-millions-of-years brains, but are really hard for computers to deal with. Even a solution like using ImageMagick to convert two source images to the same format is prone to error, since lossy formats will cause differences in the final output. See http://stackoverflow.com/questions/23931/algorithm-to-compare-two-images to get you started. – Palpatim Oct 14 '15 at 14:47
  • I don't need it to be super accurate. I just need to check, before adding a new image, whether it already exists. A hash of the file is not reliable, because the artist could have re-exported the same asset twice: the asset won't change, but its hash probably will. So maybe a pixel comparison? I don't want to write a super complex algorithm just for this; it needs to be an easy and quick tool to check assets. – Frion3L Oct 14 '15 at 15:18
  • I think you're underestimating the amount of complexity an "easy & quick" tool needs to deal with when comparing images. Image 1's upper-left pixel has an RGB color value of (0,0,0), but Image 2's upper-left pixel has an RGB color value of (0,1,0). To the eye they look the same; to a computer they're different. Do they match, or not? Multiply that by millions of pixels in an image, and even with relatively sophisticated smoothing algorithms, you face a difficult problem. Good luck, though, and if you do find a solution, I hope you post it here for others to benefit from. – Palpatim Oct 14 '15 at 15:37

1 Answer


There is the fdupes utility, which does a byte-for-byte comparison. Its -d (--delete) option prompts you to choose which files to keep when it finds duplicates. If you don't care which filename survives, you can add -N (--noprompt) so that it keeps only the first file in each set of duplicates (replace /path/to/images with your own directory):

fdupes --delete --noprompt /path/to/images
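
If you only need to know whether a given image is already present, rather than delete anything, the same byte-for-byte idea can be sketched in Python with just the standard library. This is a rough sketch and not how fdupes itself is implemented; the names file_digest and find_exact_duplicates are made up for illustration, and it only finds files whose bytes are identical to the candidate:

```python
import hashlib
import os
from typing import List


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(candidate: str, directory: str) -> List[str]:
    """List files under `directory` whose bytes are identical to `candidate`."""
    wanted_size = os.path.getsize(candidate)
    wanted_digest = file_digest(candidate)
    candidate_real = os.path.realpath(candidate)
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other non-files
            if os.path.realpath(path) == candidate_real:
                continue  # skip the candidate itself
            # Cheap size check first, so most files are never hashed.
            if os.path.getsize(path) != wanted_size:
                continue
            if file_digest(path) == wanted_digest:
                matches.append(path)
    return sorted(matches)
```

Calling find_exact_duplicates("new_icon.png", "assets") (both paths are placeholders) returns the matching paths, and an empty list means no byte-identical file was found.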

If you want to delete images that look the same but differ slightly in their bytes, that is an image recognition problem, and I don't think it has such a straightforward solution.
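
For the narrower case of the same asset being exported twice, a per-channel pixel comparison with a small tolerance may already be enough. The sketch below is only an illustration under several assumptions: it uses the third-party Pillow library to decode the files, it requires both images to have the same dimensions, it ignores transparency (everything is converted to RGB), and the function names and the default tolerance of 2 are arbitrary choices, not recommended values:

```python
from typing import Tuple


def pixels_roughly_equal(raw_a: bytes, raw_b: bytes, tolerance: int = 2,
                         max_mismatch_ratio: float = 0.0) -> bool:
    """Compare two raw channel buffers value by value.

    A channel value counts as a mismatch when it differs by more than
    `tolerance`; the buffers match when the share of mismatching values
    does not exceed `max_mismatch_ratio`.
    """
    if len(raw_a) != len(raw_b):
        return False
    if not raw_a:
        return True
    mismatches = sum(1 for x, y in zip(raw_a, raw_b) if abs(x - y) > tolerance)
    return mismatches / len(raw_a) <= max_mismatch_ratio


def load_rgb(path: str) -> Tuple[Tuple[int, int], bytes]:
    """Decode an image file into (size, raw RGB bytes) using Pillow."""
    from PIL import Image  # third-party dependency: pip install Pillow

    with Image.open(path) as img:
        rgb = img.convert("RGB")  # assumption: alpha is dropped
        return rgb.size, rgb.tobytes()


def images_look_same(path_a: str, path_b: str, tolerance: int = 2) -> bool:
    """Return True when both files decode to near-identical pixels."""
    size_a, raw_a = load_rgb(path_a)
    size_b, raw_b = load_rgb(path_b)
    return size_a == size_b and pixels_roughly_equal(raw_a, raw_b, tolerance)
```

With these defaults, the (0,0,0) versus (0,1,0) pixel from the comments counts as a match. Resized, cropped, or heavily recompressed copies would still be reported as different, so this is not a general image-similarity check.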

Emilien
  • This won't work, as we can't assume both files will have the same hash (the repeated one could have been exported again from a PSD file, for instance). – Frion3L Oct 14 '15 at 14:39