
I have an image path, problem_page, such that

from PIL import Image

problem_page = "/home/rajiv/tmp/kd/pss-images/f1-577.jpg"
img = Image.open(problem_page)

results in

PIL.Image.DecompressionBombError: Image size (370390741 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.

I'd like to respect the limit rather than increase it (increasing it is described here: Pillow in Python won't let me open image ("exceeds limit")).

How can I load the image so that its resolution is lowered to just below the limit, and the lower-resolution image is referenced by img without raising an error?
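To make "just below the limit" concrete, here is a minimal sketch of the arithmetic only, not of the loading step itself. The helper name fit_under_limit and the 20000 x 20000 example dimensions are hypothetical (the real dimensions of the file are not known here); the limit is the one quoted in the error message. Scaling both sides by sqrt(limit / pixels) keeps the aspect ratio and brings the pixel count to at most the limit.

```python
import math

# Limit quoted in the DecompressionBombError message above.
PIXEL_LIMIT = 178_956_970


def fit_under_limit(width, height, limit=PIXEL_LIMIT):
    """Return (new_width, new_height) whose product is <= limit.

    Hypothetical helper: it only computes target dimensions and
    does not open or decode any image.
    """
    pixels = width * height
    if pixels <= limit:
        # Already within the limit, nothing to scale.
        return width, height

    # Scale both sides by the same factor to keep the aspect ratio.
    scale = math.sqrt(limit / pixels)
    new_w = max(1, math.floor(width * scale))
    new_h = max(1, math.floor(height * scale))

    # Guard against floating-point rounding pushing the product over.
    while new_w * new_h > limit and new_w > 1:
        new_w -= 1
    return new_w, new_h


# Assumed example dimensions (400,000,000 pixels), for illustration only.
target = fit_under_limit(20_000, 20_000)
print(target, target[0] * target[1] <= PIXEL_LIMIT)
```

The resulting dimensions could then be passed to whichever step actually produces the image (for example the PDF-to-image conversion), which is the part this question is asking about.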

It'd be great to have a Python solution but if not, any other solution will work too.

Update (to answer questions in the comments):

These images are derived from PDFs for machine learning (ML). The PDFs come from outside the system, so we have to protect it from possible decompression bombs. For most ML, the pixel-size requirements are well below the limit imposed by PIL, so we are OK with using that limit as a heuristic to protect us.

Our current option is to use pdf2image, which converts PDFs to images, and to specify a pixel size there (e.g. width=1700 pixels, height=2200 pixels), but I was curious whether this can be done at the point of loading an image.

RAbraham
  • Not sure I understand why you want to respect an entirely arbitrary limit when your work requires you to process larger images? What's the objection to changing it please? – Mark Setchell Jan 18 '23 at 10:01
  • @MarkSetchell. Thanks for the question. I've updated the OP with an update section. lmk – RAbraham Jan 18 '23 at 16:24
  • Another option might be to use a docker image with memory limits to decompress/test in order to isolate that aspect of your processing. – Mark Setchell Jan 18 '23 at 17:19

0 Answers