I have an image (*.png) that contains two blocks of text. I am trying to extract each block of text individually using the Python Imaging Library (PIL) in Python 2.7.

I have tried blurring the image and then finding the edges of the blurred block so that I can recover the boundaries of each block (for later use with "crop"). However, when I blur the image (I've tried several iterations), the "find_edges" filter simply seems to grab the edges of each character.

from PIL import Image, ImageFilter

pic = Image.open("a.jpg")
out = pic.filter(ImageFilter.BLUR)        # soften the characters
out = out.filter(ImageFilter.FIND_EDGES)  # outline whatever edges remain

I guess I'm looking for something similar to the Photoshop "Magnetic Lasso Tool". Any idea what approach might work better?

user714852
  • Yes, the Find Edges filter in PIL is like the one of the same name in Photoshop. It traces the edges detected in the image -- solid shapes get turned into outlines. – kindall Feb 22 '12 at 21:21
  • That is what I am after, however I'd like the whole text block to be outlined rather than each individual character, which is what is happening at the moment (despite the blurring). – user714852 Feb 22 '12 at 21:26
  • And what about simply creating a function that gets a bounding box out of your edges? If you have the coordinates of the edges, you can take their extrema. – jlengrand Feb 22 '12 at 23:24
  • As a start, you can look at this answer I posted earlier today about removing whitespace with `PIL` and `numpy`: http://stackoverflow.com/questions/9396312/use-python-pil-or-similar-to-shrink-whitespace/9398422#9398422 Once you have the outer boundary you can do something similar to find the inner portions. – Hooked Feb 23 '12 at 01:44

1 Answer

I would start by making a histogram of the image projected onto one axis. First crop the image to its outer bounding box. An example of the histogram projected onto the y-axis:

from PIL import Image
import numpy as np

im = Image.open("dummytext.png")
pix = np.asarray(im)
pix = pix[:,:,0:3] # Drop the alpha channel
pix = 255 - pix  # Invert the image so the white background becomes 0
H = pix.sum(axis=2).sum(axis=1) # Sum the colors, then sum along each row: one value per y-coordinate

[Image: plot of the histogram H projected onto the y-axis]

From here, identify the largest block of white space, which shows up as the longest run of zeros in H. This determines the best y-coordinate to split at. Note how obvious it is in the histogram above. If the two text blocks are closer together you'll need a better criterion; just adapt the method to fit your needs. Once split, you can crop the images separately.
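
The splitting step could look roughly like the sketch below. It is only one way to do it, and the helper names `largest_gap` and `split_blocks` are made up for illustration. It assumes dark text on a pure white background, so that empty rows sum to exactly zero after inversion; with a noisy or off-white background you would need to compare against a small threshold instead.

```python
import numpy as np


def largest_gap(profile):
    """Return (start, stop) of the longest run of zeros that lies between
    non-zero entries of a 1-D profile, or None if there is no such run.
    The interval is half-open: rows start .. stop-1 are empty."""
    nonzero = np.flatnonzero(np.asarray(profile))
    if len(nonzero) < 2:
        return None
    # Number of empty rows between each pair of consecutive non-empty rows.
    gaps = np.diff(nonzero) - 1
    i = int(np.argmax(gaps))
    if gaps[i] == 0:
        return None
    start = int(nonzero[i]) + 1
    return start, start + int(gaps[i])


def split_blocks(im):
    """Split a PIL image into (upper, lower) crops at the middle of the
    largest empty band of rows. Hypothetical helper, not part of PIL."""
    pix = 255 - np.asarray(im.convert("RGB"))  # invert: white background -> 0
    H = pix.sum(axis=2).sum(axis=1)            # one value per y-coordinate
    gap = largest_gap(H)
    if gap is None:
        raise ValueError("no empty band of rows found")
    y = (gap[0] + gap[1]) // 2
    w, h = im.size
    return im.crop((0, 0, w, y)), im.crop((0, y, w, h))
```

With the image from the snippet above, `upper, lower = split_blocks(Image.open("dummytext.png"))` would give the two blocks as separate images. Leading and trailing white space is ignored when looking for the gap, so cropping to the outer bounding box first is not strictly required here.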

Hooked
  • What a nice post -- It should have been accepted as the answer! – srking Mar 04 '12 at 06:17
  • @srking Thanks, I'm glad it was useful to you. I'm not too worried about the acceptance, that's what the upvotes are for! – Hooked Mar 05 '12 at 02:04