Find contents of grid in an image

Question

I have a scanned Word document containing a table. I need to extract the contents of every cell/rectangle in the scanned image. For example, take a look at this image: [image]

Given that image, I need to retrieve an array of rectangles (coordinates) in C#, one for each cell in the image. I'm using AForge, but this is not a requirement.

What I've tried:

I've tried using blob processing. This works to some extent, but not always. With some images it retrieves 80-90% of the cells, while with others it retrieves only 1 blob (the whole image).
I've tried applying the following filters: Grayscale -> Otsu thresholding -> Canny edge detection, and then processing the final image with a Hough line transform. I was hoping it would keep the straight lines as black and everything else as white, which would make the task much easier using a custom algorithm. However, it either detects additional lines (probably from the text) or skips some of the lines between cells.
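
To make the "keep only the straight lines" idea concrete, here is a minimal sketch in plain Python (not C# or AForge, so it runs without any library). It is not the pipeline described above: it swaps Otsu, Canny, and Hough for a fixed threshold plus a run-length filter that keeps only dark runs spanning a large fraction of the image. The function names, the threshold of 128, and the `min_frac` parameter are all assumptions made for illustration, and it assumes an unskewed table whose ruling lines span the whole grid (no merged cells).

```python
def long_runs(mask, min_len):
    """Keep only pixels that sit in a horizontal run of >= min_len set pixels."""
    out = [[False] * len(row) for row in mask]
    for y, row in enumerate(mask):
        x, w = 0, len(row)
        while x < w:
            if not row[x]:
                x += 1
                continue
            start = x
            while x < w and row[x]:
                x += 1
            if x - start >= min_len:  # long enough to be a ruling line, not text
                for i in range(start, x):
                    out[y][i] = True
    return out


def line_bands(mask, min_len):
    """Return (first_row, last_row) for each band of rows holding a long run."""
    kept = long_runs(mask, min_len)
    bands = []
    for y, row in enumerate(kept):
        if not any(row):
            continue
        if bands and y == bands[-1][1] + 1:
            bands[-1][1] = y  # extend a thick line by one more row
        else:
            bands.append([y, y])
    return [tuple(b) for b in bands]


def find_cells(gray, threshold=128, min_frac=0.5):
    """Return cell interiors as (x, y, width, height) tuples.

    gray is a list of rows of 0-255 intensities. threshold and min_frac are
    hypothetical tuning knobs, not values taken from the question.
    """
    h, w = len(gray), len(gray[0])
    dark = [[v < threshold for v in row] for row in gray]
    transposed = [list(col) for col in zip(*dark)]  # columns become rows
    h_lines = line_bands(dark, max(1, int(w * min_frac)))
    v_lines = line_bands(transposed, max(1, int(h * min_frac)))
    cells = []
    # A cell lies between the end of one line and the start of the next.
    for (_, top), (bottom, _) in zip(h_lines, h_lines[1:]):
        for (_, left), (right, _) in zip(v_lines, v_lines[1:]):
            cells.append((left + 1, top + 1, right - left - 1, bottom - top - 1))
    return cells
```

Calling `find_cells(gray)` on a grayscale image stored as a list of rows returns one rectangle per cell, in row-major order. Text strokes are dropped because their dark runs are much shorter than `min_frac` of the image width or height. A skewed scan or a broken ruling line would defeat this simple version.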

I've tried applying different combinations of filters in both of my attempts but I was unsuccessful. How can I achieve something like this?

Maybe you can use some ideas from here: http://stackoverflow.com/q/10196198/143605 — Niki, Nov 30 '13 at 12:24
@nikie thank you, very helpful indeed. I will try a similar approach with my problem and update my post with the results (Already working on it). — Orestis P., Dec 01 '13 at 18:47
With the AForge BlobCounter, have you tried adjusting the BackgroundThreshold color? It helped my job to detect the rectangles when the borders were lighter. — John Kurtz, Oct 17 '18 at 21:15
