Separating text and image regions from an image: existing code?

Question

Separating image and text regions from an image is a very old problem and many papers have been written about it. One of the recent ones can be found here.

But I didn't find any existing code for this. Before implementing it myself, I thought it would be a good idea to ask the SO community whether anyone knows of an existing implementation.

Please point me to existing code (preferably Java) if you know of any.

Duplicate of http://stackoverflow.com/questions/1813881/java-ocr-implementation — Codey McCodeface, Mar 27 '13 at 07:43
@medPhys-pl: I don't think it's a duplicate. rivu is asking for a segmentation algorithm, not a full OCR. — rold2007, Mar 27 '13 at 20:53

score 0 · Answer 1 · answered Mar 27 '13 at 20:51


I haven't read your PDF completely, but from what I saw, a similar algorithm is implemented in C# in AForge.NET. Converting the code to Java shouldn't be a big deal.

See the HorizontalRunLengthSmoothing and VerticalRunLengthSmoothing classes.
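To give an idea of what a Java port might involve, here is a minimal sketch of run-length smoothing on a binary image. It is not AForge.NET's code and may differ from what the paper does. The class name RunLengthSmoothing, the method names, the boolean[][] representation (true = foreground pixel, rectangular array), and the maxGap parameter are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of AForge.NET or any other library.
class RunLengthSmoothing {

    // Fills each horizontal run of background pixels that is at most
    // maxGap long and is enclosed by foreground pixels on both sides.
    static boolean[][] smoothHorizontal(boolean[][] image, int maxGap) {
        boolean[][] out = new boolean[image.length][];
        for (int y = 0; y < image.length; y++) {
            out[y] = image[y].clone();
            int last = -1; // column of the previous foreground pixel, -1 if none
            for (int x = 0; x < image[y].length; x++) {
                if (!image[y][x]) continue;
                int gap = x - last - 1;
                if (last >= 0 && gap > 0 && gap <= maxGap) {
                    for (int k = last + 1; k < x; k++) out[y][k] = true;
                }
                last = x;
            }
        }
        return out;
    }

    // Same idea along columns; assumes every row has the same length.
    static boolean[][] smoothVertical(boolean[][] image, int maxGap) {
        int height = image.length;
        int width = height == 0 ? 0 : image[0].length;
        boolean[][] out = new boolean[height][];
        for (int y = 0; y < height; y++) out[y] = image[y].clone();
        for (int x = 0; x < width; x++) {
            int last = -1; // row of the previous foreground pixel, -1 if none
            for (int y = 0; y < height; y++) {
                if (!image[y][x]) continue;
                int gap = y - last - 1;
                if (last >= 0 && gap > 0 && gap <= maxGap) {
                    for (int k = last + 1; k < y; k++) out[k][x] = true;
                }
                last = y;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[][] row = {{true, false, false, true, false, false, false, true}};
        // With maxGap = 2 the first gap (length 2) is filled, the second (length 3) is not.
        System.out.println(Arrays.toString(smoothHorizontal(row, 2)[0]));
    }
}
```

After smoothing, nearby characters merge into solid blocks, so connected regions can then be measured and classified as text or image. Suitable maxGap values depend on the scan resolution and would need tuning.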


rold2007


Thanks. I found that the closest thing to what I need is probably implemented in leptonica: http://tpgit.github.com/UnOfficialLeptDocs/leptonica/document-image-analysis.html?highlight=page%20segmentation – rivu Mar 27 '13 at 23:49
