
I am writing a program in OpenCV where I want to adjust the camera position. Is there any metric in OpenCV to measure the amount of perspective in two images? How can a homography be used to quantify the degree of perspective distortion between two images? The method that comes to mind is to run edge detection and compare the lengths of parallel edges, but that method is prone to errors.

[example image: two photos of a book taken with different amounts of perspective distortion]

  • A single metric for the perspective transformation could be approximated by the degree of the transformation matrix. – gfdsal Jun 13 '21 at 22:24
  • Is this related to [this problem](https://stackoverflow.com/questions/66912832/how-do-i-use-warpperspective-correctly/66915842#66915842)? – Yunus Temurlenk Jun 14 '21 at 05:56
  • @YunusTemurlenk Yes, +1. In that problem the cards were merely rotated, with little perspective transformation. My concern is to compare two perspective images, like the example in the question above, and say which one gives the better view. The one with the least perspective is better, of course. So I am looking for a metric to quantify this, so I can select the image. My concern is not to fix the distortion, just to quantify it. – user0193 Jun 14 '21 at 14:17
  • I had a somewhat related problem a long, long time ago: finding the width/height ratio of a perspective-distorted rectangle. Maybe this could be of use? https://stackoverflow.com/q/1194352/145999 – HugoRune Jun 14 '21 at 16:17
  • @HugoRune Yes, that's great, +1. My problem is a lot simpler: to select whichever of the two images is less perspectively distorted. – user0193 Jun 14 '21 at 20:05

1 Answer


As a first solution I'd recommend maximizing the distance between the image of the line at infinity and the center of your picture.

Identify at least two pairs of lines that are parallel in the original (undistorted) image. Intersect the lines of each pair and connect the resulting intersection points. Best do all of this in homogeneous coordinates, so you won't have to worry about lines that are still parallel in the transformed version. Then compute the distance between the center of the image and that connecting line, possibly taking the resolution of the image into account to make the result invariant to resampling. The result will be infinity for an image obtained by a purely affine transformation, so the larger that value, the closer you are to the affine scenario.
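In NumPy terms the recipe above can be sketched as follows. This is a minimal sketch: the line endpoints are assumed to come from some detection step (e.g. a Hough transform), and the function names are mine, not from any library.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points (cross product)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def perspective_metric(pair1, pair2, image_shape):
    """Distance from the image center to the vanishing line.

    pair1 and pair2 each hold two lines, given as point pairs
    ((p, q), (r, s)), that are parallel in the real-world scene.
    Returns infinity for a purely affine image; larger means
    "less perspective".
    """
    # Intersect each pair of formerly parallel lines -> vanishing point.
    # Homogeneous coordinates handle the still-parallel case gracefully:
    # the intersection is then an ideal point with third coordinate 0.
    v1 = np.cross(line_through(*pair1[0]), line_through(*pair1[1]))
    v2 = np.cross(line_through(*pair2[0]), line_through(*pair2[1]))
    # Connecting the two vanishing points gives the image of the
    # line at infinity.
    vline = np.cross(v1, v2)
    h, w = image_shape[:2]
    center = np.array([w / 2.0, h / 2.0, 1.0])
    # Homogeneous point-line distance, divided by the image diagonal
    # so the metric is invariant to resampling.
    dist = abs(vline @ center) / np.hypot(vline[0], vline[1])
    return dist / np.hypot(w, h)
```

For the book example, the two pairs would be the two opposite-edge pairs of the detected book outline.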

MvG
  • Thanks a lot! Unfortunately, finding parallel lines without prior knowledge will be difficult, since all images will be perspectively warped by the camera. In the case of the book above, we do know it's a rectangle, but I reckon it will be challenging to do in code, as all I have is lines from a Hough transform of the book. In the left perspective image above, only one pair of the book's lines is close to parallel. – user0193 Jun 14 '21 at 20:20
  • One more thing: did you mean perspective transformation or affine transformation in the answer above? – user0193 Jun 14 '21 at 20:21
  • @user0193 you need to have *some* information about the real world object depicted. Do you always have one image of the object with no perspective distortion? In that case you might [compute the transformation matrix](https://math.stackexchange.com/a/339033/35416) based on four matching detected features, then transform the line at infinity with that. In [a different context](https://math.stackexchange.com/a/295409/35416) I could use congruent irregular polygons to infer the line at infinity, but that seems like a very special case. I meant "affine" as "not projective" in my last sentence. – MvG Jun 15 '21 at 06:35
  • Awesome. To answer the question: I will always have images taken by cameras, so pretty much all will have perspective distortion. The task is merely to rank them based on how much perspective there is in them. Usually images that have less perspective distortion provide more information about the objects they portray. So comparing the two images I have above, the one on the left is more informative, as it is less perspective and nearly isometric, while the one on the right is highly perspective. A metric of, say, 0-10 would quantify this, so I could assign 8 to the image on the left and 1 to the one on the right. – user0193 Jun 15 '21 at 11:37
  • Mathematically speaking the left image could be an undistorted picture of an irregular quad. You need information about the real world to know that this is unlikely. If you want to go for "more information" you might combine both pictures, and always read information from the source that has more pixels per area. Presenting the combined image in a reasonable way would be tricky, as would be dealing with the fact that at a given point you could have high resolution in one direction but very low in the orthogonal direction, i.e. the images of pixels in the other picture would be very thin. – MvG Jun 16 '21 at 05:47
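If, as suggested in the comments, a homography H mapping an undistorted reference view onto the photo is available (e.g. from cv2.findHomography on four matched features), the vanishing line can be computed directly: lines transform by the inverse transpose of H, so the image of the line at infinity (0, 0, 1)ᵀ is H⁻ᵀ (0, 0, 1)ᵀ. A sketch under that assumption (the function name is mine):

```python
import numpy as np

def metric_from_homography(H, image_shape):
    """Perspective metric given a homography H that maps the undistorted
    reference view onto the photo: distance from the image center to the
    transformed line at infinity, normalized by the image diagonal."""
    # Lines transform contravariantly: l' = H^-T l.
    vline = np.linalg.inv(H).T @ np.array([0.0, 0.0, 1.0])
    h, w = image_shape[:2]
    center = np.array([w / 2.0, h / 2.0, 1.0])
    denom = np.hypot(vline[0], vline[1])
    if denom == 0.0:  # affine H: the line stays at infinity
        return np.inf
    return abs(vline @ center) / denom / np.hypot(w, h)
```

This agrees with the answer: a purely affine H yields infinity, and a larger value means the view is closer to affine, so ranking images by this value ranks them by "least perspective".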