I have the following code. img_1 and img_2 turn out not to be identical: comparing the pixels shows a small difference that is hard to see by eye but causes significant problems later on. Why does this happen, and how can I read the img_2 version of the image directly from S3?

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
import io
from PIL import Image, ImageChops

        
s3 = boto3.resource('s3', region_name='us-east-1')
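bucket = s3.Bucket(bucket_name)  # bucket_name: the S3 bucket's name, assumed to be defined elsewhere
key = "some_key.jpeg"
obj = bucket.Object(key)  # renamed from `object` to avoid shadowing the built-in
response = obj.get()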
file_stream = response['Body']
img_1 = Image.open(file_stream)

img_1.save('/tmp/img1.jpeg')
img_2 = Image.open('/tmp/img1.jpeg')
diff = ImageChops.difference(img_1, img_2)
    
# getbbox() returns None when every pixel of the difference image is zero
if diff.getbbox():
    print("images are different")
else:
    print("images are the same")
  • https://stackoverflow.com/a/64954285/2836621 – Mark Setchell May 02 '22 at 13:51
  • @MarkSetchell What if the original image placed in S3 was already in JPEG format? Why would it lose more data? – user19015712 May 02 '22 at 18:10
  • JPEG is lossy. It is allowed to lose data in order to save space or transmission time. If you want your images to retain their quality, you need to use a lossless format. – Mark Setchell May 02 '22 at 18:36
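
The lossy round trip described in the comments above can be reproduced without S3. This is a minimal sketch, assuming a synthetic in-memory image stands in for the S3 object; `make_jpeg_image` and `roundtrip_changes_pixels` are hypothetical helper names, not part of Pillow.

```python
import io

from PIL import Image, ImageChops


def make_jpeg_image(size=(64, 64), quality=90):
    """Return a deterministic noisy image, decoded from an in-memory JPEG."""
    width, height = size
    pixels = bytes((i * 37) % 256 for i in range(width * height * 3))
    buf = io.BytesIO()
    Image.frombytes("RGB", size, pixels).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    img = Image.open(buf)
    img.load()  # decode now so the pixels are held in memory
    return img


def roundtrip_changes_pixels(img, fmt, **save_kwargs):
    """Save img in the given format, reopen it, and report whether any pixel changed."""
    buf = io.BytesIO()
    img.save(buf, format=fmt, **save_kwargs)
    buf.seek(0)
    reopened = Image.open(buf)
    # getbbox() is None when the difference image is entirely zero
    return ImageChops.difference(img, reopened).getbbox() is not None


img_1 = make_jpeg_image()  # plays the role of the image read from S3
jpeg_changed = roundtrip_changes_pixels(img_1, "JPEG")  # lossy re-encode
png_changed = roundtrip_changes_pixels(img_1, "PNG")    # lossless re-encode
print("JPEG round trip changed pixels:", jpeg_changed)
print("PNG round trip changed pixels:", png_changed)
```

If the pixels must survive a save and reload unchanged, a lossless format such as PNG is the safer choice for the temporary file.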

1 Answer

This is probably down to the JPEG quality factor: save() re-encodes the image, and its default quality (75) may well be lower than the quality the original was saved with.

What happens if you save using img_1.save('/tmp/img1.jpeg', quality="maximum")?

More information on the JPEG quality presets in Pillow is here: https://pillow.readthedocs.io/en/stable/reference/JpegPresets.html
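
To check how much the quality setting matters, the largest per-channel pixel change can be measured for a few save() options. This is a rough sketch, assuming a synthetic JPEG-decoded image in place of the S3 object; `make_jpeg_image`, `max_pixel_change` and `compare_quality_settings` are hypothetical helpers. Note that `quality="keep"` is only accepted for images that were opened from a JPEG.

```python
import io

from PIL import Image, ImageChops


def make_jpeg_image(size=(64, 64), quality=90):
    """Return a deterministic noisy image, decoded from an in-memory JPEG."""
    width, height = size
    pixels = bytes((i * 37) % 256 for i in range(width * height * 3))
    buf = io.BytesIO()
    Image.frombytes("RGB", size, pixels).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    img = Image.open(buf)
    img.load()
    return img


def max_pixel_change(img, **save_kwargs):
    """Re-encode img as JPEG and return the largest per-channel difference (0-255)."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", **save_kwargs)
    buf.seek(0)
    resaved = Image.open(buf)
    diff = ImageChops.difference(img, resaved)
    # For an RGB image getextrema() yields one (min, max) pair per band
    return max(high for _low, high in diff.getextrema())


def compare_quality_settings(img):
    """Measure the pixel change for the default, "maximum" and "keep" settings."""
    return {
        "default": max_pixel_change(img),
        "maximum": max_pixel_change(img, quality="maximum"),
        "keep": max_pixel_change(img, quality="keep"),
    }


results = compare_quality_settings(make_jpeg_image())
for name, change in results.items():
    print(f"{name}: largest pixel difference = {change}")
```

A higher quality setting should shrink the difference, but any non-zero value still makes `getbbox()` report the images as different.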

Jonathan
  • Unfortunately, that doesn't seem to have resolved the issue. – user19015712 May 02 '22 at 14:23
  • I am fairly sure it has to do with the quality. From the Pillow docs: "quality: The image quality, on a scale from 0 (worst) to 95 (best), or the string keep. The default is 75. Values above 95 should be avoided; 100 disables portions of the JPEG compression algorithm, and results in large files with hardly any gain in image quality. The value keep is only valid for JPEG files and will retain the original image quality level, subsampling, and qtables." (source: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html?#jpeg) – Jonathan May 03 '22 at 12:14
  • You can try `quality="keep"`, although even then I get a bounding box when comparing. I am not sure your method is 100% valid. – Jonathan May 03 '22 at 12:21
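
On the second part of the question (getting the img_2 version without a temporary file), one possible approach is to read the object's bytes once and re-encode in memory. This is a sketch under assumptions, not a confirmed fix: `fetch_s3_bytes`, `open_from_bytes` and `reencode_jpeg` are hypothetical helper names, and `bucket_name` is a placeholder for the real bucket.

```python
import io

from PIL import Image, ImageChops


def fetch_s3_bytes(bucket_name, key, region_name="us-east-1"):
    """Download an S3 object and return its raw, untouched bytes."""
    import boto3  # imported lazily so the other helpers work without AWS access

    s3 = boto3.resource("s3", region_name=region_name)
    return s3.Object(bucket_name, key).get()["Body"].read()


def open_from_bytes(raw):
    """Decode image bytes held in memory (the equivalent of img_1)."""
    img = Image.open(io.BytesIO(raw))
    img.load()
    return img


def reencode_jpeg(img, **save_kwargs):
    """Save img as JPEG into memory and reopen it (the equivalent of img_2)."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", **save_kwargs)
    buf.seek(0)
    out = Image.open(buf)
    out.load()
    return out


# Intended use (requires AWS credentials, so it is left commented out):
# raw = fetch_s3_bytes(bucket_name, "some_key.jpeg")
# img_1 = open_from_bytes(raw)
# img_2 = reencode_jpeg(img_1)  # same pixels as saving to /tmp and reopening
```

If the goal is instead to keep img_1 and img_2 identical, writing `raw` to disk unchanged (for example with `open(path, "wb").write(raw)`) avoids the re-encode altogether, since the file is then a byte-for-byte copy of the S3 object.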