Using python and PIL how can I grab a block of text in an image?

Question

I have an image (*.png) which contains two blocks of text. I am trying to grab each block of text individually using the Python Imaging Library (PIL) in Python 2.7.

I have tried to blur the image and then find the edges of the blurred block so that I can then recover the boundaries of each block (for use later with "crop"). However when I blur the image (I've tried several iterations) the "find_edges" filter simply seems to grab the edges of each character.

from PIL import Image, ImageFilter

pic = Image.open("a.jpg")
out = pic.filter(ImageFilter.BLUR)        # soften the glyphs
out = out.filter(ImageFilter.FIND_EDGES)  # still traces each character

I guess I'm looking for something similar to the Photoshop "Magnetic Lasso Tool". Any idea what approach may be better?

Yes, the Find Edges filter in PIL is like the one of the same name in Photoshop. It traces the edges detected in the image -- solid shapes get turned into outlines. — kindall, Feb 22 '12 at 21:21
That is what I am after, however I'd like the whole text block to be outlined rather than each individual character, which is what is happening at the moment (despite the blurring). — user714852, Feb 22 '12 at 21:26
And what about simply creating a function that would get a bounding box out of your edges? If you have the coordinates of the latter, you can find the extrema. — jlengrand, Feb 22 '12 at 23:24
As a start, you can look at this answer I posted earlier today about removing whitespace with `PIL` and `numpy`: http://stackoverflow.com/questions/9396312/use-python-pil-or-similar-to-shrink-whitespace/9398422#9398422 Once you have the outer boundary you can do something similar to find the inner portions. — Hooked, Feb 23 '12 at 01:44

score 12 · Accepted Answer · edited May 23 '17 at 12:34

I would start by making a histogram of the image projected onto one axis. Crop your image to its outer bounding box first. Here is an example of the histogram projected onto the y-axis:

from PIL import Image
import numpy as np

im = Image.open("dummytext.png").convert("RGB")  # Force 3 channels; drops any alpha
pix = np.asarray(im)
pix = 255 - pix  # Invert so the white background is 0 and the text is positive
H = pix.sum(axis=2).sum(axis=1)  # Sum the colors, then along each row: one value per y

[Figure: histogram of the image projected onto the y-axis]

From here, identify the largest block of white space; it determines the best y-coordinate to split at. Note how obvious it is in the histogram above. If the two text blocks are closer together you'll need a better criterion; just adapt the method to fit your needs. Once split, you can crop the images separately.
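The splitting step could look roughly like the sketch below. It is not part of the original answer: the helper names `largest_gap` and `split_blocks` and the `threshold` parameter are hypothetical, and it assumes dark text on a white background with the two blocks stacked vertically.

```python
import numpy as np
from PIL import Image


def largest_gap(profile, threshold=0):
    """Return (start, stop) of the longest blank run between inked rows.

    The range is half-open: rows start..stop-1 are blank. Blank margins
    before the first and after the last inked row are ignored. Returns
    None when there is no blank row between inked rows.
    """
    profile = np.asarray(profile)
    inked = np.flatnonzero(profile > threshold)  # rows that contain text
    if inked.size < 2:
        return None
    gaps = np.diff(inked)  # distance between consecutive inked rows
    i = int(np.argmax(gaps))
    if gaps[i] <= 1:  # every inked row is adjacent to the next one
        return None
    return int(inked[i]) + 1, int(inked[i + 1])


def split_blocks(im, threshold=0):
    """Split a PIL image into top and bottom parts at the widest blank band."""
    rgb = np.asarray(im.convert("RGB"), dtype=np.int64)
    # Same projection as above: invert, sum colors, then sum along each row.
    profile = (255 - rgb).sum(axis=2).sum(axis=1)
    gap = largest_gap(profile, threshold)
    if gap is None:
        return [im]
    start, stop = gap
    cut = (start + stop) // 2  # cut in the middle of the blank band
    width, height = im.size
    return [im.crop((0, 0, width, cut)), im.crop((0, cut, width, height))]
```

Calling `split_blocks(Image.open("dummytext.png"))` would then give one image per block. For scanned or JPEG-compressed input the background is rarely pure white, so a `threshold` above 0 is probably needed.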

answered Feb 23 '12 at 02:25

Hooked

What a nice post -- It should have been accepted as the answer! – srking Mar 04 '12 at 06:17
@srking Thanks, I'm glad it was useful to you. I'm not too worried about the acceptance, that's what the upvotes are for! – Hooked Mar 05 '12 at 02:04
