
Data description

  • The dataset consists of forms photographed by hand, so image quality varies widely and the forms are rarely squared up with the frame.
  • All forms follow the same template, and their elements can be extracted through template matching, provided the form is not rotated or skewed.

Here's a cropped sample from the template to give an idea. The real data is much dirtier than the template. Ideally, I would like to extract the positions of the different segments of the form, like the one in the red square, with high precision.

Image example

Aim

Match the different parts of each form in the dataset to the template. This should make it possible to align the images and make information extraction easier.

What has been tried

  • Preprocessing: basically follows this answer.

  • Information extraction: I used OpenCV's multi-scale template matching. It works half-decently when the form is aligned with the template and fails when it is rotated or skewed, which is why I'm wondering whether it would be better to align the images first.

  • Image alignment: I tested homography estimation following this tutorial. Results were mediocre at best. I suspect the issue lies in the preprocessing.

I'm really interested in your thoughts on the matter.

Wajd Meskini
  • Any news on how you solved the issues? – Alexander Langer Oct 25 '21 at 17:26
  • To be honest, I kept working on my preprocessing with the aim of multi-scale template matching. Results were improving, but I have since switched projects. I had found a way to fix the skew, if it is very slight, using this tutorial: [link](https://stackoverflow.com/questions/57964634/python-opencv-skew-correction-for-ocr/57965160#57965160) – Wajd Meskini Oct 28 '21 at 07:09
