3

I am interested in running some surveys by snail mail, and I am looking for quick ways to digitize the surveys that respondents send back.

For example, if a question had five boxes beneath it and respondents indicated their opinion by checking the appropriate box, is there any software I could run the scanned page through that would output the responses?

Edit clarification:

I am asking about what to do after the paper has been digitized. I want to write some code that looks at an image file, recognizes which box has been marked, and outputs a representation of the respondent's answers.

I would be looking at a page scanned from a desktop scanner or something similar.

jimstandard
  • Do you own a Scantron device? Or are you trying to emulate a Scantron device using a desktop scanner? What are you trying to do? – S.Lott Jan 09 '12 at 19:27
  • If you were willing to let go of python, you might find auto-multiple-choice (home.gna.org/auto-qcm/) of interest... – dat Dec 19 '13 at 15:36

3 Answers

3

From what I see, you don't really need ICR (intelligent character recognition, used for handwritten and hand-printed text); what you need is OMR, optical mark recognition (capturing human-marked data from document forms such as surveys and tests).

The bad news is that you will have a hard time finding an open-source library for Python. But there is a solution: you can use a cloud SDK, a web service that lets you upload an image and sends back the recognized data. Try www.ocrsdk.com, a cloud-based OCR SDK recently launched by ABBYY. It is currently in closed beta, so it is completely free to use.

It has both ICR and OMR API methods and a set of Python code samples.

Nikolay
2

The SDAPS project (repo) might be worth a look. It may not handle arbitrary scanned images, as it seems to expect an ODT or LaTeX document at the beginning of the process.

Overview

SDAPS is an open source (GPLv3, LPPL) optical mark recognition (OMR) program. It is written in python and has an integrated workflow with both LibreOffice and LaTeX to create questionnaires.

Workflow

With SDAPS you create the questionnaire using either LibreOffice or LaTeX. After this some processing is done to collect the information about the survey (questions, and answers) and a printable PDF is created. The filled out questionnaires only need to be scanned in (example). SDAPS will do the optical mark recognition and can create a PDF report (example) or export the data. Optionally it is possible to manually correct the results using a graphical user interface.

bollwyvl
2

I don't really see what this has to do with Python, unless of course you've already digitized the results and are now looking to tally them up. It sounds like you still need to scan the results in, and as far as I know, Python doesn't have any direct capability for that. You're going to have to get your hands on a scanner first, and only then can you use Python to read through the data.

purpleladydragons
  • I am not concerned with the scanning. I am interested in a piece of code that takes an image and says "answered yes to questions 1, 4, 7 and 9". Python because that is what I am most familiar with. – jimstandard Jan 09 '12 at 19:21
  • @jimstandard: Please update your question rather than add comments to answers. – S.Lott Jan 09 '12 at 19:26
  • @jimstandard Admittedly, I don't know how to write something that would be able to do that with an image, but if you can find software that converts a scanned image to text, it'd be pretty simple to write a script that looks for certain text patterns in the file. If you can't convert it to text, you might be able to look for dark pixels at the coordinates at which you expect them, but again I can't help much with finding stuff in an image. – purpleladydragons Jan 09 '12 at 19:27
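
The "look for dark pixels at the coordinates at which you expect them" idea from the last comment can be sketched in a few lines of plain Python. This is only a minimal illustration under several assumptions, not a tested OMR implementation: the page is assumed to be already loaded as rows of grayscale values (0 = black, 255 = white), the box coordinates are assumed to be known in advance from the form layout, and the names `dark_fraction`, `marked_box`, `demo_page`, the threshold of 128, and the `min_fill` cutoff of 0.2 are all hypothetical choices made for the example.

```python
def dark_fraction(pixels, box, threshold=128):
    """Fraction of pixels inside `box` that are darker than `threshold`.

    pixels: list of rows, each a list of grayscale values 0..255.
    box: (left, top, right, bottom), with right and bottom exclusive.
    """
    left, top, right, bottom = box
    total = (right - left) * (bottom - top)
    dark = sum(
        1
        for y in range(top, bottom)
        for x in range(left, right)
        if pixels[y][x] < threshold
    )
    return dark / total


def marked_box(pixels, boxes, min_fill=0.2):
    """Index of the darkest box, or None if no box reaches `min_fill`.

    min_fill is an assumed cutoff; it would need tuning on real scans.
    """
    fractions = [dark_fraction(pixels, box) for box in boxes]
    best = max(range(len(boxes)), key=fractions.__getitem__)
    return best if fractions[best] >= min_fill else None


def demo_page():
    """Build a synthetic white page with five boxes, the fourth filled in."""
    width, height = 60, 12
    pixels = [[255] * width for _ in range(height)]
    boxes = [(2 + 12 * i, 2, 10 + 12 * i, 10) for i in range(5)]
    left, top, right, bottom = boxes[3]
    for y in range(top, bottom):
        for x in range(left, right):
            pixels[y][x] = 0  # simulate a pen mark
    return pixels, boxes


if __name__ == "__main__":
    page, boxes = demo_page()
    print("marked box index:", marked_box(page, boxes))
```

On a real scan the pixel rows would come from an imaging library (for instance, Pillow can open a file and convert it to grayscale with `Image.open(path).convert("L")`), and the page would probably need to be deskewed and aligned first so the expected coordinates actually line up with the printed boxes. The printed outline of each box also counts as dark pixels, so the regions should sit slightly inside the outlines, or `min_fill` should be raised accordingly.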