Affine Transform, Simple Rotation and Scaling or something else entirely?

Question

The scenario goes like this: I have a picture of a sheet of paper that I would like to run OCR on. Take the image below as my input example:

orig_image

After successfully detecting the area that corresponds to the paper I'm left with a vector<Point> of 4 coordinates that define its location inside the image. Note that these coordinates will probably not correspond to a perfect rectangle due to the distance of the camera and angle when the picture was taken. For viewing purposes I connected the points in the sub-image so you can see what I mean:

detected_image

In this case, the points are: [1215, 43], [52, 67], [56, 869] and [1216, 884]

At this point, I need to adjust these points so they become aligned horizontally. What do I mean by that? If you look at the area of the sub-image above, it is a little rotated: the points on the right side of the image are positioned a little higher than the points on the other side.

In other words, we have image A, which was exaggerated on purpose to look a little more distorted/rotated than reality, and then image B - which is what I would like as the final result of this procedure:

A) bad_rect B) ok_rect

I'm not sure which techniques could be used to achieve this transformation. The application also needs to detect automatically how much rotation needs to be done, as I don't have control over the image acquisition procedure.

The purpose is to have a new Mat with the normalized sub-image. I'm not worried about possible image distortion right now. I'm just looking for a way to identify how much rotation needs to be applied to the sub-image, and how to apply it to get a more rectangular area.
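To make "how much rotation" concrete, here is a minimal sketch that orders the four corners and measures the tilt of the top and bottom edges. It deliberately avoids OpenCV so it compiles on its own: `Pt`, `orderCorners` and `edgeAngleDeg` are hypothetical names introduced for illustration (`Pt` stands in for something like `cv::Point2f`), and the ordering trick assumes the sheet is roughly upright, with a skew well below 45 degrees.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Pt { double x, y; };  // hypothetical stand-in for cv::Point2f

// Order four corners as top-left, top-right, bottom-right, bottom-left.
// Assumption: the quadrilateral is roughly upright (skew well below 45 degrees).
std::array<Pt, 4> orderCorners(const std::vector<Pt>& c) {
    assert(c.size() == 4);
    auto bySum  = [](const Pt& a, const Pt& b) { return a.x + a.y < b.x + b.y; };
    auto byDiff = [](const Pt& a, const Pt& b) { return a.y - a.x < b.y - b.x; };
    const Pt tl = *std::min_element(c.begin(), c.end(), bySum);   // smallest x + y
    const Pt br = *std::max_element(c.begin(), c.end(), bySum);   // largest x + y
    const Pt tr = *std::min_element(c.begin(), c.end(), byDiff);  // smallest y - x
    const Pt bl = *std::max_element(c.begin(), c.end(), byDiff);  // largest y - x
    return std::array<Pt, 4>{{tl, tr, br, bl}};
}

// Angle of the edge a -> b in degrees. Image coordinates have y pointing down,
// so a negative angle means b sits higher in the image than a.
double edgeAngleDeg(const Pt& a, const Pt& b) {
    const double kPi = std::acos(-1.0);
    return std::atan2(b.y - a.y, b.x - a.x) * 180.0 / kPi;
}
```

With the points listed above, `edgeAngleDeg(tl, tr)` comes out at roughly -1.18 degrees for the top edge, while `edgeAngleDeg(bl, br)` gives roughly +0.74 degrees for the bottom edge. The two edges tilt in opposite directions, so no single rotation angle can level both of them.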

score 6 · Accepted Answer · answered Oct 18 '11 at 00:02

I think http://felix.abecassis.me/2011/10/opencv-rotation-deskewing/ and http://felix.abecassis.me/2011/10/opencv-bounding-box-skew-angle/ will come in handy. Those posts cover rotation only, not perspective warping. To get the best results, you'll have to use `warpPerspective` (maybe in conjunction with `getRotationMatrix2D`, though note that the latter returns a 2x3 affine matrix of the kind `warpAffine` expects). Use the angles between line segments to find out how much you need to warp the perspective. The assumption here is that they should always be 90 degrees and that the one closest to 90 degrees is the "closest" vector as far as the perspective is concerned.

Don't forget to normalize your vectors!
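As a rough illustration of the angle check, the sketch below normalizes the two edge vectors leaving each corner and takes the arccosine of their dot product. It uses plain C++ rather than OpenCV types, and `Pt`, `normalize` and `cornerAnglesDeg` are hypothetical names for this example. It assumes the corners are already listed in order around the perimeter.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

struct Pt { double x, y; };  // hypothetical stand-in for cv::Point2f

// Return a unit-length copy of v.
Pt normalize(const Pt& v) {
    const double len = std::hypot(v.x, v.y);
    assert(len > 0.0);  // coincident corners would give a zero-length edge
    return {v.x / len, v.y / len};
}

// Interior angle in degrees at each corner of a convex quadrilateral whose
// corners are given in order around the perimeter.
std::array<double, 4> cornerAnglesDeg(const std::array<Pt, 4>& q) {
    const double kPi = std::acos(-1.0);
    std::array<double, 4> out{};
    for (int i = 0; i < 4; ++i) {
        const Pt& p = q[i];
        const Pt& prev = q[(i + 3) % 4];
        const Pt& next = q[(i + 1) % 4];
        const Pt a = normalize({prev.x - p.x, prev.y - p.y});
        const Pt b = normalize({next.x - p.x, next.y - p.y});
        // Clamp to guard acos against rounding slightly outside [-1, 1].
        const double d = std::max(-1.0, std::min(1.0, a.x * b.x + a.y * b.y));
        out[i] = std::acos(d) * 180.0 / kPi;
    }
    return out;
}
```

For the corners in the question, taken in the order top-left, top-right, bottom-right, bottom-left, every angle lands within about a degree of 90, which suggests the perspective distortion in that particular shot is mild.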

David Titarenco


+1 and add that OP may want to read up on [camera calibration](http://opencv.willowgarage.com/documentation/camera_calibration_and_3d_reconstruction.html). This isn't exactly calibration, but understanding the projective geometry involved is very helpful. – dantswain Oct 18 '11 at 01:22

Those are awesome links @David, thanks! I've implemented and tested those techniques, and for the sake of performance I'm sticking with a simple `warpPerspective()` operation for now, but I know that eventually I'll go back to using a proper deskewing method. [This link](http://nuigroup.com/forums/viewthread/3414/) also helped me a lot. – karlphillip Oct 20 '11 at 03:53

score 1 · Answer 2 · answered Oct 18 '11 at 00:52

It's called Keystone correction, or keystoning. It transforms a shape that looks like a trapezoid into a rectangle.

The Book Scan Wizard program offers techniques to correct this artifact; you may want to check it out.
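The trapezoid-to-rectangle mapping behind keystone correction is a projective transform (a homography) fixed by four point correspondences. The sketch below solves for it with plain Gauss-Jordan elimination so the arithmetic is visible. `Pt`, `Homography`, `solveHomography` and `applyHomography` are hypothetical names for this example, and it assumes the bottom-right matrix entry can be normalized to 1, which holds for mild distortions like the one in the question.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

struct Pt { double x, y; };                 // hypothetical stand-in for cv::Point2f
using Homography = std::array<double, 9>;   // row-major 3x3 with h[8] fixed to 1

// Solve for the homography that maps src[i] onto dst[i] for four point pairs.
Homography solveHomography(const std::array<Pt, 4>& src,
                           const std::array<Pt, 4>& dst) {
    double a[8][9] = {};  // augmented matrix [A | b] for the 8 unknowns
    for (int i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y, u = dst[i].x, v = dst[i].y;
        const double r0[9] = {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
        const double r1[9] = {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        for (int c = 0; c < 9; ++c) { a[2 * i][c] = r0[c]; a[2 * i + 1][c] = r1[c]; }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        assert(std::fabs(a[pivot][col]) > 1e-12 && "degenerate corner layout");
        for (int c = 0; c < 9; ++c) std::swap(a[col][c], a[pivot][c]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int c = col; c < 9; ++c) a[r][c] -= f * a[col][c];
        }
    }
    Homography h{};
    for (int i = 0; i < 8; ++i) h[i] = a[i][8] / a[i][i];
    h[8] = 1.0;
    return h;
}

// Map a single point through the homography (with the perspective divide).
Pt applyHomography(const Homography& h, const Pt& p) {
    const double w = h[6] * p.x + h[7] * p.y + h[8];
    return {(h[0] * p.x + h[1] * p.y + h[2]) / w,
            (h[3] * p.x + h[4] * p.y + h[5]) / w};
}
```

In OpenCV itself you would not normally write this by hand: `getPerspectiveTransform` computes the 3x3 matrix from four source and four destination points, and `warpPerspective` applies it to the whole image. The choice of destination rectangle (for example, its width and height in pixels) is up to the caller.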

nguyenq

