I'm new here. I'm using Python 3.7 with Pillow. I have many images that each contain a combination of math formulas and Arabic words, and I want to crop them using the getbbox() function, which calculates the bounding box of the non-zero region (example1). This works well, but for a long formula the resulting crop does not contain all of the image's content (example2).

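from PIL import Image, ImageChops

# buf, white, i and j are defined elsewhere in the script
im = Image.open(buf)
bg = Image.new(im.mode, im.size, white)  # solid background, same mode and size as im
diff = ImageChops.difference(im, bg)  # non-zero wherever im differs from the background
diff = ImageChops.add(diff, diff)  # double the difference to amplify faint pixels
bbox = diff.getbbox()  # bounding box of the non-zero region, or None if there is none
im.crop(bbox).save('output\\original_data\\' + str(i) + '_' + str(j) + '_original.png')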
Hamed
  • Have you tried inverting the image first and then calling getbbox? It looks almost like some of the text decorations might just get picked up as noise and filtered out. This question might be useful: https://stackoverflow.com/questions/9870876/getbbox-method-from-python-image-library-pil-not-working – Mark Carpenter Jr Mar 10 '20 at 13:32
  • I already tried this solution, but I got this error: raise IOError("not supported for this image mode") OSError: not supported for this image mode – Hamed Mar 10 '20 at 15:31
  • Ah, that makes sense; invert is for color images. I think you might be able to do something like `invert = ~bg` to invert the image. – Mark Carpenter Jr Mar 10 '20 at 18:46
  • Thanks for your suggestion, but I am working on black and white images. – Hamed Mar 10 '20 at 18:49
  • There doesn't appear to be an Arabic word following the second formula example? – Mark Setchell Mar 11 '20 at 14:45
  • No, there is an Arabic word, but the crop produced from the getbbox result does not include it. – Hamed Mar 15 '20 at 09:46
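
The inversion idea from the comments can be sketched as follows. This is only a minimal sketch, assuming dark content on an opaque white background; `crop_to_content` is a hypothetical helper name, not part of Pillow. Converting to mode "L" first is meant to sidestep the "not supported for this image mode" error, which most likely comes from ImageOps.invert being given a mode it does not handle. It is not confirmed to fix the long-formula case.

```python
from PIL import Image, ImageOps


def crop_to_content(im):
    """Crop a dark-on-white image to the bounding box of its dark pixels.

    Hypothetical helper; assumes an opaque white background.
    """
    # Convert to 8-bit grayscale so ImageOps.invert accepts the image
    # even if it was opened in a mode such as "1" or "P".
    gray = im.convert("L")
    # After inversion the white background is 0 and the content is non-zero,
    # which is what getbbox() looks for.
    inverted = ImageOps.invert(gray)
    bbox = inverted.getbbox()
    # getbbox() returns None for a completely blank image; return it uncropped.
    return im.crop(bbox) if bbox is not None else im
```

With the names from the question, this would be called as `crop_to_content(Image.open(buf))` before saving. Printing `im.mode` and the computed `bbox` for a failing image may also help show whether the box or the crop is at fault.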
