Taking parts of one image to create another image

Question

I'm working with images from which I would like to take parts out and make one new image. I can make use of ImageMagick or OpenCV. Here is a sample image:

enter image description here

From this image I would like to extract the title, the two annotated texts (one in a circle, one in a rectangle), and the text at the bottom.

So the final image would contain: Image Title, Annotated Text1, Annotated Text, and This is some test. These parts don't have to be in any particular order in the new image.

Questions

What kind of strategy can I use to do this?
Will Hough or Canny help?
I'm thinking that since the parts of the image I want back are all text, maybe a Hough line transform can detect the straight lines, and then I can crop out those parts of the image.
My main goal is to extract the text so I can send it to an OCR engine.

I've tried to erode the image and came up with this:

enter image description here

My Strategy

The following is my strategy to keep only the parts of the image that have a white background and text. However, I'm not sure if this is doable with OpenCV.

There will be different ROIs in the image:

There will always be a white background at the top of the image; let's call this space the title. So I crop out the rectangle at the top of the image and save it as a separate image.
There will always be a white background at the bottom of the image; let's call this the body. So I crop out the rectangle at the bottom of the image and save it as a separate image.
There will be some text on top of the image; let's call this the annotated text. It will be in squares or circles. I can use the technique mentioned in this answer to crop out those parts of the image and save them as separate images.
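The first two steps (title and body bands) can be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: the image is grayscale, a pixel counts as "white" at 240 or above, and a row belongs to a band when at least 60% of its pixels are white (so rows containing dark text still qualify). The names `white_bands`, `white_thresh` and `min_white_frac` are hypothetical, and the thresholds would need tuning on real images.

```python
def row_white_fraction(row, white_thresh=240):
    """Fraction of pixels in one row that are at least `white_thresh` bright."""
    return sum(1 for p in row if p >= white_thresh) / len(row)


def white_bands(gray, white_thresh=240, min_white_frac=0.6):
    """Return (title_end, body_start) row indices.

    Rows [0, title_end) are the mostly-white band at the top (title) and
    rows [body_start, height) are the mostly-white band at the bottom (body).
    Both thresholds are assumed values, not taken from the question.
    """
    height = len(gray)
    is_white = [row_white_fraction(r, white_thresh) >= min_white_frac for r in gray]

    # Walk down from the top while rows stay mostly white.
    title_end = 0
    while title_end < height and is_white[title_end]:
        title_end += 1

    # Walk up from the bottom, never crossing into the title band.
    body_start = height
    while body_start > title_end and is_white[body_start - 1]:
        body_start -= 1

    return title_end, body_start


# Tiny synthetic "image": white row, text row, three dark photo rows,
# text row, white row. A text row is mostly white with a few dark pixels.
WHITE = [255] * 8
TEXT = [255, 0, 255, 255, 0, 255, 255, 255]
PHOTO = [40] * 8
demo = [WHITE, TEXT, PHOTO, PHOTO, PHOTO, TEXT, WHITE]

title_end, body_start = white_bands(demo)
title = demo[:title_end]    # rows to save as the title image
body = demo[body_start:]    # rows to save as the body image
```

The function only iterates over rows and pixels, so it should also accept a 2-D grayscale array such as the one returned by `cv2.imread(path, cv2.IMREAD_GRAYSCALE)`, where the same slices (`gray[:title_end]`, `gray[body_start:]`) give the two crops. For large images a vectorized version would be much faster.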

Why not use OCR on the image as is? the text is already clean and on white background. — Bitwise, Apr 01 '13 at 03:09
This is a sample image. In other certain images the text is really close to the square and circles. In those cases I'm only able to read the title and text below the image, not the annotated text. To get better success ratio, I wanted to be able to take out parts of the image and feed them to OCR or pre-process the image such a way that nothing other than the text is left in the image — birdy, Apr 01 '13 at 03:23
Text detection is normally a machine learning phase in the pipeline. If you have known constraints on font type and/or size, then perhaps using a sliding window technique to train an SVM on known examples would be a starting point. OpenCV has many ML examples, isn't OCR among them? — Roger Rowland, Apr 01 '13 at 08:58
Also, take a look at this paper - http://www.math.tau.ac.il/~turkel/imagepapers/text_detection.pdf — Roger Rowland, Apr 01 '13 at 09:11
I came across some of those research papers but most of them are detecting natural text. Text in my images is always going to be pretty straight since someone will always be adding text to an existing image. I'm updating the question with my strategy. — birdy, Apr 01 '13 at 11:08

score 0 · Answer 1 · answered Apr 27 '13 at 05:47

If you are dealing only with similar-looking fonts, and you are not looking for something super efficient, you can simply correlate the image with a template of each letter of the alphabet (26 uppercase and 26 lowercase). Threshold the correlation maps to keep only the peaks and add them together. You can then define your bounding boxes around the peaks.
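A minimal sketch of that idea, in plain Python so it stays self-contained. It is one possible reading of the answer, not the answerer's code: it uses mean-subtracted normalized cross-correlation, a threshold of 0.9 (an assumed value), and it collects the thresholded peaks from all templates as boxes instead of literally summing the maps, since templates can differ in size. The names `ncc_map`, `find_letter_boxes` and `union_box` are hypothetical. In OpenCV, `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same kind of score map far faster.

```python
from math import sqrt


def ncc_map(image, templ):
    """Normalized cross-correlation of `templ` at every valid offset in `image`."""
    ih, iw = len(image), len(image[0])
    th, tw = len(templ), len(templ[0])
    t_vals = [v for row in templ for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = sqrt(sum(d * d for d in t_dev))

    scores = []
    for y in range(ih - th + 1):
        row_scores = []
        for x in range(iw - tw + 1):
            w_vals = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = sqrt(sum(d * d for d in w_dev))
            denom = t_norm * w_norm
            # Flat windows (zero variance) cannot match a letter: score 0.
            score = sum(a * b for a, b in zip(w_dev, t_dev)) / denom if denom else 0.0
            row_scores.append(score)
        scores.append(row_scores)
    return scores


def find_letter_boxes(image, templates, thresh=0.9):
    """Boxes (x, y, w, h) wherever any letter template scores at least `thresh`."""
    boxes = []
    for templ in templates:
        th, tw = len(templ), len(templ[0])
        for y, row in enumerate(ncc_map(image, templ)):
            for x, score in enumerate(row):
                if score >= thresh:
                    boxes.append((x, y, tw, th))
    return boxes


def union_box(boxes):
    """Smallest (x, y, w, h) box that contains every box in `boxes`."""
    x0 = min(x for x, _, _, _ in boxes)
    y0 = min(y for _, y, _, _ in boxes)
    x1 = max(x + w for x, _, w, _ in boxes)
    y1 = max(y + h for _, y, _, h in boxes)
    return (x0, y0, x1 - x0, y1 - y0)


# Toy example: a 3x3 "T" template placed at x=2, y=1 in a 7x5 image.
letter_t = [[1, 1, 1],
            [0, 1, 0],
            [0, 1, 0]]
image = [[0, 0, 0, 0, 0, 0, 0],
         [0, 0, 1, 1, 1, 0, 0],
         [0, 0, 0, 1, 0, 0, 0],
         [0, 0, 0, 1, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0]]

boxes = find_letter_boxes(image, [letter_t])
text_region = union_box(boxes)
```

With real images, the templates would be small crops of each letter rendered in the expected font and size, and nearby boxes would typically be merged into one region per text block before cropping for OCR.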
