
I want to analyze a scanned form and detect whether someone has checked or filled in check boxes at various places on it (perhaps similar to a Scantron), and maybe also capture the image of a signature.

Since these check boxes will be at known locations, it seems I could sample a few pixels around (x, y) and average them; if the result is darker than some threshold N, the box is checked. However, I imagine that scanning could introduce a large shift in the actual position relative to the edge of the image.

As is clear, I am a newbie in this area. Does a framework exist (open source or commercial), or are there any patterns or examples anyone could point me to, to start down this path? (Or might this be impossible to do in .NET, so that I should start looking into an unmanaged application?)

aceinthehole
  • Might be a duplicate question: http://stackoverflow.com/questions/8576652/how-to-programmatically-read-over-a-scanned-document-or-image?rq=1 – aceinthehole Feb 27 '14 at 16:00
  • Have a look at the image processing library OpenCV; it is also available for C#: http://www.emgu.com/wiki/index.php/Main_Page – alex Feb 27 '14 at 16:01
  • Try researching [OCR](http://en.wikipedia.org/wiki/Optical_character_recognition) libraries. – mbeckish Feb 27 '14 at 16:06
  • We actually do exactly this at our organization. We could not find a suitable framework for what we were doing, but the algorithm was fairly simple to build. Crop a bitmap at the "known location" of the checkbox, do a `GetPixel(x,y)` over the area of the box, and then compare the numbers to determine a value. You can also save the cropped image if you want to capture the actual visible area of the checkbox. – Evan L Feb 27 '14 at 17:18
  • Any number of image processing frameworks could be used for this application. OpenCV is free, which is a nice price, but it is not the most straightforward to use or necessarily the best. With any image processing application, please post at least one sample image. For those of us who are professionals in the field and who have to produce reports for prospective customers, this is a must. – Rethunk Mar 01 '14 at 06:05

2 Answers


This is referred to as ICR (Intelligent Character Recognition).
It is an established field. ICR software does edge detection because skewed scans are common.
You can try to do it yourself, but there is a lot to it.

Leadtools is not free, and I don't work for them,
but this is a good example of ICR as a tool (SDK):
LEADTOOLS ICR SDK

If you have the documents on paper, another option is to take them to a commercial scan vendor.
They will have software designed for ICR.
They also have high-end scanners meant to work with the ICR software.

paparazzo
  • I'm pretty sure ICR/OCR is _well_ beyond what the OP needs for this task. – wbest Feb 28 '14 at 00:18
  • @wbest Really check box at various places and signature with edge detection is WELL beyond ICR? If that is not ICR then what is it? – paparazzo Feb 28 '14 at 01:20
  • OP compared what he was looking for to Scantron. If he can be relatively confident of where the boxes are in the sheet, we can just look in that ROI. No OCR required. – wbest Feb 28 '14 at 15:17
  • @wbest Why do you keep saying OCR - that is not in my answer. Really he can be confident of where the boxes are? Then explain this statement. "I imagine that scanning in could introduce a large shift in the actual position, relative to the edge of the image." – paparazzo Feb 28 '14 at 15:34
  • You are referring to something akin to Optical Character Recognition, right, or does ICR work differently? Also, you can frequently accommodate for shifts in the image through basic image processing methods. But given the OP is new to the field of CV/IP, perhaps you are right that commercial is the way to go. – wbest Feb 28 '14 at 16:01
  • @wbest So what if OCR is akin. I call finding a checkbox on a skewed page ICR (and I used to manage a scan shop). If you know how to accommodate for shift with basic image processing methods then post it. But I would still call it ICR and I have found it not to be basic. – paparazzo Feb 28 '14 at 16:30
  • OCR/ICR is not needed for this application at all. Standard template matching (normalized correlation), connected components, and similar basic algorithms would be sufficient, especially since the ovals to be filled in are located in a grid. – Rethunk Mar 01 '14 at 06:06
  • @Rethunk The problem does *not* state that ovals to be filled in are located in a grid. It states check boxes at various places. – paparazzo Mar 01 '14 at 13:32
  • Okay: ovals, squares, circles, stars, the shape doesn't matter as long as it's the same shape used throughout the image. Same point as before: OCR isn't necessary, though OCR algorithms could be trained to find arbitrary shapes. There happen to be better tools for the job, though. – Rethunk Mar 02 '14 at 17:46
  • @Rethunk Where did I say OCR? If you have a better tool then post it. I did not state that it was the best tool, only an example. If ovals in a grid do not matter, then why did you state "especially since"? – paparazzo Mar 02 '14 at 17:56

I'm not familiar with .NET image processing, but I know image processing in general, so I'll give you the theory and references to OpenCV.

To compensate for skew in the image, look into Fourier transforms and the Hough transform (Hough lines). Basically, you run the Fourier transform and then turn the result into a black-and-white image. Run HoughLines on it to find the strongest lines, and keep the longest of them. This line will be one of the axis lines; in my experiments it was usually the vertical axis. Find its angle of deviation from a straight vertical line, and then (depending on the sign convention of the particular rotation algorithm) rotate the image by the negative of this amount.
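As a rough sketch of the last two steps only (pick the longest line, measure its deviation from vertical): I'm assuming here that the line detector hands back segments as `(x1, y1, x2, y2)` tuples, which is what something like OpenCV's `HoughLinesP` produces. The function names and the sample segments are made up for illustration, and the sign you need for the final rotation depends on your library.

```python
import math

def longest_segment(segments):
    """Return the longest (x1, y1, x2, y2) segment from a non-empty list."""
    return max(segments, key=lambda s: math.hypot(s[2] - s[0], s[3] - s[1]))

def deviation_from_vertical(segment):
    """Signed angle in degrees between a segment and the vertical axis.

    0 means perfectly vertical. The sign follows image coordinates
    (x to the right, y downward): a line whose lower end lies to the
    right of its upper end gives a positive angle.
    """
    x1, y1, x2, y2 = segment
    dx, dy = x2 - x1, y2 - y1
    if dy < 0:  # orient the segment top-to-bottom so its direction does not matter
        dx, dy = -dx, -dy
    return math.degrees(math.atan2(dx, dy))

# Hypothetical detector output: one long near-vertical line plus two shorter ones.
segments = [(100, 50, 130, 1050), (20, 20, 220, 24), (300, 900, 310, 600)]
line = longest_segment(segments)
skew = deviation_from_vertical(line)
correction = -skew  # rotate by the negative of the measured deviation
print(f"skew = {skew:.2f} degrees, rotate by {correction:.2f} degrees")
```

If the longest line turns out to be the horizontal axis instead, the same idea applies with the angle measured against the horizontal.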

If the rotation algorithm fills the exposed corners with 0's (or with a white that's too far off the color of the page), you can crop the image, using the angle found earlier to calculate how much to trim. (This is where all that trig you learned in school comes in handy.)
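One way to do that trig, as a sketch: compute the largest upright rectangle that still fits entirely inside the rotated page, then take a centered crop of that size. This assumes the rotation was about the image center; `largest_upright_crop` is a name I made up, and this is one possible cropping rule, not the only one.

```python
import math

def largest_upright_crop(width, height, angle_deg):
    """Size (w, h) of the largest axis-aligned rectangle that fits inside a
    width x height rectangle rotated by angle_deg about its center."""
    if width <= 0 or height <= 0:
        return 0.0, 0.0
    angle = math.radians(angle_deg)
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    long_side, short_side = max(width, height), min(width, height)
    if short_side <= 2.0 * sin_a * cos_a * long_side or abs(sin_a - cos_a) < 1e-10:
        # Large rotation: the crop is limited by the shorter side only.
        half = 0.5 * short_side
        if width >= height:
            return half / sin_a, half / cos_a
        return half / cos_a, half / sin_a
    # Small rotation (the usual deskew case): all four crop corners touch
    # the rotated page edges, which gives two linear equations in (w, h).
    cos_2a = cos_a * cos_a - sin_a * sin_a
    return ((width * cos_a - height * sin_a) / cos_2a,
            (height * cos_a - width * sin_a) / cos_2a)

# Example: a 1000 x 1400 scan that was rotated by 2 degrees to deskew it.
crop_w, crop_h = largest_upright_crop(1000, 1400, 2.0)
left = (1000 - crop_w) / 2  # centered crop offsets
top = (1400 - crop_h) / 2
```

For small skew angles the trimmed border is only a few percent of the page, so fixed checkbox locations near the page edge are the ones to watch.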

Then find the bounding box that encloses the text on the page and crop down to that. When checking whether a box is checked, look at an area probably about 5-10 pixels larger than the checkbox, depending on resolution, to get the checkbox ROI.
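Here is a minimal sketch of both steps in plain Python, assuming the image is a list of rows of grayscale values (0 = black, 255 = white) and that the known checkbox positions are measured from the top-left corner of the cropped page. The function names, the threshold of 128, and the padding value are illustrative assumptions.

```python
def ink_bounding_box(image, dark_threshold=128):
    """Bounding box (left, top, right, bottom), inclusive, of all pixels
    darker than dark_threshold, or None if nothing on the page is dark."""
    rows = [y for y, row in enumerate(image)
            if any(p < dark_threshold for p in row)]
    if not rows:
        return None
    cols = [x for x in range(len(image[0]))
            if any(row[x] < dark_threshold for row in image)]
    return cols[0], rows[0], cols[-1], rows[-1]

def padded_roi(x, y, width, height, pad, image_width, image_height):
    """Grow a checkbox rectangle by pad pixels on each side, clamped to the
    image. Returns (left, top, right, bottom), right/bottom exclusive."""
    left = max(0, x - pad)
    top = max(0, y - pad)
    right = min(image_width, x + width + pad)
    bottom = min(image_height, y + height + pad)
    return left, top, right, bottom

# Tiny synthetic page: white everywhere except two dark pixels.
page = [[255] * 8 for _ in range(6)]
page[2][3] = 0
page[4][5] = 10
print(ink_bounding_box(page))                         # (3, 2, 5, 4)
print(padded_roi(100, 200, 20, 20, 8, 1000, 1400))    # (92, 192, 128, 228)
```

On a real scan, speckle noise can stretch the bounding box, so some denoising or a minimum count of dark pixels per row and column may be needed before trusting it.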

With this, you can test whether at least x% of the ROI is written in to decide whether the box was checked.
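A sketch of that test, again on a list-of-rows grayscale image. The 15% cutoff and the darkness threshold of 128 are placeholders I picked; the printed outline of an empty box also counts as ink, so the cutoff should be calibrated against a blank copy of the form.

```python
def fill_fraction(image, roi, dark_threshold=128):
    """Fraction of pixels inside roi that are darker than dark_threshold.

    roi is (left, top, right, bottom) with right/bottom exclusive.
    """
    left, top, right, bottom = roi
    total = (right - left) * (bottom - top)
    if total <= 0:
        return 0.0
    dark = sum(1 for row in image[top:bottom]
               for p in row[left:right] if p < dark_threshold)
    return dark / total

def is_checked(image, roi, min_fraction=0.15, dark_threshold=128):
    """Treat the box as checked when enough of the ROI is inked."""
    return fill_fraction(image, roi, dark_threshold) >= min_fraction

# Synthetic 10 x 10 patch with an "X" drawn inside a 6 x 6 ROI.
patch = [[255] * 10 for _ in range(10)]
roi = (2, 2, 8, 8)
for i in range(6):
    patch[2 + i][2 + i] = 0   # one stroke of the X
    patch[2 + i][7 - i] = 0   # the other stroke

blank = [[255] * 10 for _ in range(10)]
print(is_checked(patch, roi), is_checked(blank, roi))  # True False
```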

wbest