4

Is there any way (using command-line tools) to calculate an MD5 hash for .NEF (also .CR2, .TIFF) files that ignores all metadata, e.g. EXIF, IPTC, XMP and so on?

The MD5 hash should stay the same when any metadata inside the image file is updated.

I searched for a while; the closest solution I found is:

exiftool test.nef -all= -o - -m | md5

but `exiftool -all=` still keeps a set of EXIF tags in the output file, so the MD5 hash can still change if I update the remaining tags.

XHou
  • I found a solution here: http://stackoverflow.com/questions/23984963/. `exiv2 rm` works best; `exiftool` and `convert` can't remove all metadata from a .NEF file. I tried `exiv2 rm | md5` on my original .NEF file and on the file output by `exiftool -all=`, and the results are the same. The output file of `exiv2 rm` can no longer be displayed, but I only need the MD5 hash to stay the same after updating any metadata of the .NEF file, so it works perfectly for my requirements. – XHou Jul 09 '15 at 06:20
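
The `exiv2 rm` approach from the comment above can be sketched as a small shell function. `exiv2 rm` edits the file in place, so this works on a throwaway copy and leaves the original alone. It assumes `exiv2` and `md5sum` are installed (on macOS, `md5` would take the place of `md5sum`); the function and variable names are hypothetical, and the behaviour on any particular RAW format is untested here.

```shell
# Hash a copy of the file after exiv2 has deleted its metadata.
# Assumes `exiv2` and `md5sum` are on the PATH; all names are illustrative.
hash_after_exiv2_rm() {
    strip_dir=$(mktemp -d) || return 1
    # Keep the original file name (and extension) for the copy.
    strip_copy="$strip_dir/$(basename "$1")"
    # `exiv2 rm` modifies its argument in place, so only the copy is touched.
    if cp "$1" "$strip_copy" && exiv2 rm "$strip_copy"; then
        md5sum < "$strip_copy" | cut -d ' ' -f 1
        strip_status=$?
    else
        strip_status=1
    fi
    rm -rf "$strip_dir"
    return "$strip_status"
}
```

Calling `hash_after_exiv2_rm test.nef` before and after a metadata edit should then print the same digest if the approach holds for that file.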

3 Answers

5

ImageMagick has a method for doing exactly this. It is installed on most Linux distros and is available for OSX (ideally via homebrew) and also Windows. There is an escape for the image signature which includes only pixel data and not metadata - you use it like this:

identify -format %# _DSC2007.NEF
feb37d5e9cd16879ee361e7987be7cf018a70dd466d938772dd29bdbb9d16610

I know it does what you want and that the calculated checksum does not change when you modify the metadata on PNG files for example, and I know it does calculate the checksum correctly for CR2 and NEF files. However, I am not in the habit of modifying RAW files such as you have and have not tested it does the right thing in that case - though I would be startled if it didn't! So please test before use.
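
One way to run that test is to record the signature of the original file and of a metadata-edited copy, then compare the two. A minimal sketch, assuming ImageMagick's `identify` is on the PATH; the function names are hypothetical:

```shell
# Print ImageMagick's pixel-data signature for an image file.
# Assumes `identify` (ImageMagick) is installed; function names are illustrative.
pixel_signature() {
    # %# is quoted so that no shell treats '#' as the start of a comment.
    identify -format '%#\n' "$1"
}

# Succeed only if both files report the same, non-empty signature.
same_pixels() {
    sig_a=$(pixel_signature "$1") || return 2
    sig_b=$(pixel_signature "$2") || return 2
    [ -n "$sig_a" ] && [ "$sig_a" = "$sig_b" ]
}
```

For example, copy `_DSC2007.NEF` to `edited.NEF`, change some tags in the copy, and run `same_pixels _DSC2007.NEF edited.NEF && echo "pixel data unchanged"`.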

Mark Setchell
  • I tried `identify -format` on my `.NEF`. The results are different for the original file and the output file of `exiftool test.nef -all=` :( BTW, I agree with you that changing a .NEF file is not a good habit. – XHou Jul 08 '15 at 18:27
  • You can't expect the output of `ImageMagick` to be consistent with that of `exiftool`, they use different methods. You need to stick to one or the other - not a mixture of `IM` one time and `exiftool` the next. Use `ImageMagick` to get a checksum, diddle with the NEF file and then use `ImageMagick` again to see if it has changed. – Mark Setchell Jul 08 '15 at 18:34
  • I need the hash to stay constant after any EXIF metadata changes, no matter what software changes the metadata. Currently `ImageMagick` can't even work together with `exiftool`. How can I trust that the output of `identify` is a hash of the image data only? – XHou Jul 09 '15 at 06:10
1

Some Exif data is still left because the image data for a NEF file (and similar TIFF-based file types) is located within that Exif block. Remove that block and you have removed the image data. See ExifTool FAQ 7, which has an example shortcut tag that may help you out.

StarGeek
  • If the shortcut tag mentioned in the FAQ doesn't work, your idea might be the only way to go. – StarGeek Jul 08 '15 at 01:05
  • I understand the reason. I only need the MD5 hash (or some other hash) not to change after I update any tags. I don't really want to strip all metadata from the file. – XHou Jul 08 '15 at 18:32
0

I assume your intention is to verify that the actual image data has not been tampered with.
An alternative to stripping the metadata is to convert the image to a format that has no metadata.
ImageMagick is a well-known open-source tool (distributed under an Apache 2.0-style license) for image manipulation and conversion. It provides libraries with various language bindings as well as command-line tools for various operating systems.

You could try:

convert test.nef bmp:- | md5

This converts test.nef to BMP on stdout and pipes it to md5.
As far as I recall, BMP has no support for metadata, and I'm not sure ImageMagick even preserves metadata across conversions.
This will only work with single-image files (i.e. not multi-image TIFFs or GIF animations). There is also a slight possibility that some changes to the image produce the same converted output because of color space conversions, but such changes would not be visible.
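
Note that `md5` in the pipeline above is the macOS/BSD tool; on most Linux systems the equivalent is `md5sum`. A minimal sketch that wraps the same pipeline so it runs with either one, assuming ImageMagick's `convert` is installed; the function names are hypothetical:

```shell
# Hash stdin with whichever MD5 tool is available and print only the hex digest.
md5_of_stdin() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum | cut -d ' ' -f 1    # GNU md5sum prints "<digest>  -"
    else
        md5 -q                      # BSD/macOS md5; -q prints only the digest
    fi
}

# Decode an image to BMP on stdout and hash the result.
# Caveat: if `convert` fails and writes nothing, this prints the MD5 of empty input.
hash_decoded_image() {
    convert "$1" bmp:- | md5_of_stdin
}
```

`hash_decoded_image test.nef` then prints a single digest that can be compared before and after a metadata edit.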

Eli Algranti