Assume I have an image, pic.jpg. I read the image and then save it.

from PIL import Image
im = Image.open('pic.jpg')
im.save('pic1.jpg', 'jpeg')

The MD5 sums of the two pictures are different:

$ md5sum pic.jpg
3191102e44fa5ebbb2aa52e042066dad
$ md5sum pic1.jpg
a6b17e3af3ff66715a2326db33548d11

Do I still have the original image if I read and then save with PIL?

coin cheung

2 Answers

After some comparison, it seems that PIL recompresses the image (the binary data is entirely different), and the headers change as well (in my case, an Adobe header, title, and author were present in the original but disappeared after saving).
If you want to do the comparison yourself, you can run:

xxd pic.jpg > pic.hex
xxd pic1.jpg > pic1.hex
diff pic.hex pic1.hex
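
The hex diff above compares file bytes. To check whether the decoded pixels changed too, something like the following sketch may help. It uses an in-memory gradient as a stand-in for pic.jpg, and the helper names (`make_test_jpeg`, `resave_jpeg`, `max_pixel_difference`) are hypothetical, not part of PIL.

```python
import io
from PIL import Image, ImageChops

def make_test_jpeg(quality=90):
    """Build a small in-memory JPEG (assumed stand-in for pic.jpg)."""
    # 64x64 RGB gradient, so the encoder has non-trivial content to compress
    pixels = bytes(v for y in range(64) for x in range(64)
                   for v in (x * 4, y * 4, (x + y) * 2))
    img = Image.frombytes("RGB", (64, 64), pixels)
    buf = io.BytesIO()
    img.save(buf, "JPEG", quality=quality)
    return buf.getvalue()

def resave_jpeg(data):
    """Open JPEG bytes and save them again, as in the question."""
    im = Image.open(io.BytesIO(data))
    out = io.BytesIO()
    im.save(out, "JPEG")  # PIL re-encodes with its own default settings
    return out.getvalue()

def max_pixel_difference(a, b):
    """Largest per-channel difference between two decoded images."""
    im_a = Image.open(io.BytesIO(a)).convert("RGB")
    im_b = Image.open(io.BytesIO(b)).convert("RGB")
    diff = ImageChops.difference(im_a, im_b)
    # getextrema() returns one (min, max) pair per band for RGB images
    return max(hi for _, hi in diff.getextrema())

original = make_test_jpeg()
resaved = resave_jpeg(original)
print("bytes identical:", original == resaved)
print("max pixel difference:", max_pixel_difference(original, resaved))
```

A result of 0 would mean the decoded pixels match even though the files differ; any larger value means the re-encode changed the picture itself.
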
Jorropo

No, JPEG is lossy. It throws away information to make your image smaller. Different coders/decoders (i.e. writers/readers) throw away different information and choose different quality settings.

If you want to be able to save and reload your image and have it stay identical, you need to use a lossless format such as PNG.
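
As a rough illustration of the lossless case, the sketch below hashes the decoded pixels rather than the file bytes and checks that a PNG round trip leaves them unchanged. The helpers `pixel_digest` and `resave` are hypothetical names, the test image is synthetic, and SHA-256 is used here although MD5, as in the question, would work the same way.

```python
import hashlib
import io
from PIL import Image

def pixel_digest(data):
    """Hash the decoded pixels (plus mode and size), ignoring file metadata."""
    im = Image.open(io.BytesIO(data))
    h = hashlib.sha256()
    h.update(im.mode.encode())
    h.update(repr(im.size).encode())
    h.update(im.tobytes())
    return h.hexdigest()

def resave(data, fmt):
    """Open image bytes and save them again in the given format."""
    im = Image.open(io.BytesIO(data))
    out = io.BytesIO()
    im.save(out, fmt)
    return out.getvalue()

# Synthetic 8x8 RGB image as an assumed stand-in for a real picture
src = Image.frombytes("RGB", (8, 8), bytes(range(192)))
buf = io.BytesIO()
src.save(buf, "PNG")
png = buf.getvalue()

png_again = resave(png, "PNG")
print("pixels unchanged after PNG round trip:",
      pixel_digest(png) == pixel_digest(png_again))
```

Comparing pixel digests instead of running md5sum on the files sidesteps differences that come only from headers or metadata.
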

Even then, your image potentially contains the date and time, so if you load or create an image and save it, then save it again 2 seconds later, the two files will have hashes that differ.

See also here and here.

Mark Setchell