I am working on text extraction from images, and for this I am using an edge detection technique. I have detected the edges of an image that contains both text and non-text regions.
Now I want to eliminate the non-text regions from the image.
How can I do this?

The code I have so far is:

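% Read the image and convert it to grayscale
i = imread('t1.jpg');
i1 = rgb2gray(i);
figure; imshow(i1);

% Detect edges with the Canny detector (threshold 0.3)
i2 = edge(i1,'canny',0.3);
figure; imshow(i2);

% Dilate the edges with a 2x2 square so that nearby strokes merge
se = strel('square',2);
i3 = imdilate(i2,se);
figure; imshow(i3);

% Fill the holes enclosed by the dilated edges
i4 = imfill(i3,'holes');
figure; imshow(i4);

% Label the connected components and collect their bounding boxes
[Ilabel, num] = bwlabel(i4);
disp(num);
Iprops = regionprops(Ilabel,'BoundingBox');
Ibox = [Iprops.BoundingBox];
Ibox = reshape(Ibox,4,[]);
figure; imshow(i);

% Draw every bounding box in red on top of the original image
hold on;
for cnt = 1:size(Ibox,2)
   rectangle('position',Ibox(:,cnt),'edgecolor','r');
end
hold off;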


Shai
Noman Malik

  • What exactly is the desired output? Is it an image where any pixel that doesn't fall within a red square is white/black/`NaN`? Or perhaps would you like to get a cell array of images, where each image is a different red-square enclosure? Please explain what you mean by "eliminate". – Dev-iL Apr 20 '16 at 22:16
  • I want to differentiate between text-region boxes and non-text-region boxes, and then extract the text from the red boxes. – Noman Malik Apr 21 '16 at 10:10

1 Answer

You might consider using a more text-oriented method.
Have you considered the "Stroke-Width Transform" (SWT)? This transform filters edges according to how likely they are to belong to a ridge of roughly constant width, which is a typical characteristic of text strokes.
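
To make the idea concrete, here is a rough sketch of a stroke-width check per connected component. Note that this is not the SWT itself: it uses a common approximation based on the distance transform and a thinned centre line, built from the Image Processing Toolbox functions bwconncomp, regionprops, bwdist and bwmorph. The synthetic mask bw stands in for the filled mask i4 from the question, and the threshold maxVariation is a hypothetical value that would need tuning on real images.

```matlab
% Synthetic binary mask (an assumption standing in for i4 from the question):
% a bar of constant thickness and a wedge whose thickness grows along its length.
bw = false(100, 200);
bw(40:46, 10:90) = true;                             % stroke-like bar, 7 px thick
[X, Y] = meshgrid(1:200, 1:100);
bw(X >= 110 & X <= 190 & abs(Y - 50) <= (X - 110) / 4) = true;   % wedge

cc    = bwconncomp(bw);
stats = regionprops(cc, 'Image', 'BoundingBox');

swVariation = zeros(1, cc.NumObjects);               % std/mean of the stroke width
for k = 1:cc.NumObjects
    % Pad with background so the distance transform sees a border on every side.
    mask     = padarray(stats(k).Image, [1 1]);
    distToBg = bwdist(~mask);                        % distance to nearest background pixel
    skel     = bwmorph(mask, 'thin', Inf);           % one-pixel-wide centre line
    widths   = double(distToBg(skel));               % roughly half the stroke width
    swVariation(k) = std(widths) / mean(widths);
end

maxVariation = 0.4;                                  % hypothetical threshold, tune per data set
isTextLike   = swVariation < maxVariation;           % low variation suggests a text stroke
textBoxes    = vertcat(stats(isTextLike).BoundingBox);   % one [x y w h] row per kept region
```

Components whose stroke width varies little along the centre line are kept as text candidates, and textBoxes can then be drawn with rectangle in the same way as in the question. On real images this test is usually combined with other region properties, since thin non-text structures such as lines also have a nearly constant width.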

Community
Shai