1

A popular question but for which I can't find a definitive answer. Math wizards we need you!

The Problem is the following

We have a unmodified photograph (not cropped) of a rectangular object taken from some angle. We don't know the real size of the rectangle or any info about that picture.

GOAL : We want to find the aspect ratio / proportions of the rectangle with only this image as input.

This question isn't linked to a particular programming language (pseudo code allowed) but I want to implement the solution in Python using NumPy.

Example image with a paper sheet in perspective

In this photograph it's an A4 paper sheet so the code/algorithm should find a ratio of 1.4142 : 1

My Questions

  1. What is the best solution (See Info section below), handling edges cases ? Is there another one I not listed ?
  2. Is there an implementation (C++, Python, ...) in a maintained library package (OpenCV, SciPy, ...) ?

Info

After digging the internet it's seems that this problem isn't easy to solve, there is some info :

Others StackOverflow related questions

There is some questions on StackOverflow with the same topic/subject/goal:

BUT the answers seems to differ in the implementation(even though they claim to be based on the same research paper) and be incomplet (not always handling parallel opposite sides).

My current working implementations

Based on all the info above I implemented the two methods in Python with OpenCV & NumPy:

Method #1

def compute_aspect_ratio(image, corners):
    # Based on :
    # - https://andrewkay.name/blog/post/aspect-ratio-of-a-rectangle-in-perspective/

    # Step 1: Get image center, will be used as origin
    h, w = image.shape[:2]
    origin = (w * .5, h * .5)
    
    # Step 2: Points coords from image origin
    # /!\ CAREFUL : points need to be in zig-zag order (A, B, D, C)
    a = corners[0] - origin
    b = corners[1] - origin
    c = corners[3] - origin
    d = corners[2] - origin
    
    # Step 3: Check if the camera lie into the plane of the rectangle
    # Coplanar if three points are collinear, in that case the aspect ratio cannot be computed
    M = numpy.array([[b[0], c[0], d[0]], [b[1], c[1], d[1]], [1., 1., 1.]])
    det = numpy.linalg.det(M)
    if math.isclose(det, 0., abs_tol=.001):
        # Cannot compute the aspect ratio, the caller need to check if the return value is 0.
        return 0.

    # Step 4: Create the matrixes
    A = numpy.array([[1., 0., -b[0], 0., 0., 0.],
                     [0., 1., -b[1], 0., 0., 0.],
                     [0., 0.,    0., 1., 0., -c[0]],
                     [0., 0.,    0., 0., 1., -c[1]],
                     [1., 0., -d[0], 1., 0., -d[0]],
                     [0., 1., -d[1], 0., 1., -d[1]]], dtype=float)
    B = numpy.array([[b[0]-a[0]],
                     [b[1]-a[1]],
                     [c[0]-a[0]],
                     [c[1]-a[1]],
                     [d[0]-a[0]],
                     [d[1]-a[1]]], dtype=float)

    # Step 5: Solve it, this will give us [Ux, Uy, (Uz / λ), Vx, Vy, (Vz / λ)]
    s = numpy.linalg.solve(A, B)

    # Step 6: Compute λ, it's the focal length
    l = 0.
    l_sq = ((-(s[0] * s[3]) - (s[1] * s[4])) / (s[2] * s[5]))
    if l_sq > 0.:
        l = numpy.sqrt(l_sq)
    # If l_sq <= 0, λ cannot be computed, two sides of the rectangle's image are parallel
    # Either Uz and/or Vz is equal zero, so we leave l = 0

    # Step 7: Get U & V
    u = numpy.linalg.norm([s[0], s[1], (s[2] * l)])
    v = numpy.linalg.norm([s[3], s[4], (s[5] * l)])

    return (v / u)

Method #2

def compute_aspect_ratio(image, corners):
    # Based on :
    # - https://www.microsoft.com/en-us/research/publication/2016/11/Digital-Signal-Processing.pdf
    # - http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf

    # Step 1: Compute image center (a.k.a origin)
    h, w = image.shape[:2]
    origin = (w * .5, h * .5)

    # Step 2: Make homeneous coordinates
    # /!\ CAREFUL : points need to be in zig-zag order (A, B, D, C)
    p1 = numpy.array([*corners[0], 1.])
    p2 = numpy.array([*corners[1], 1.])
    p3 = numpy.array([*corners[3], 1.])
    p4 = numpy.array([*corners[2], 1.])

    k2 = numpy.dot(numpy.cross(p1, p4), p3) / numpy.dot(numpy.cross(p2, p4), p3)
    k3 = numpy.dot(numpy.cross(p1, p4), p2) / numpy.dot(numpy.cross(p3, p4), p2)

    # Step 3: Computing U & V vectors, but at this point the z value of these vectors are in the form z / f
    # Where f is the focal length
    u = (k2 * p2) - p1
    v = (k3 * p3) - p1

    # Step 4: Unpack vectors to avoid using accessors
    uX, uY, uZ = u
    vX, vY, vZ = v

    # Step 5: Check if two opposite sides are parallel (or almost parallel)
    if math.isclose(uZ, .0, abs_tol=.01) or math.isclose(vZ, .0, abs_tol=.01):
        aspect_ratio = numpy.sqrt((vX ** 2 + vY ** 2) / (uX ** 2 + uY ** 2))
        return aspect_ratio

    # Step 6: Compute the focal length
    f = numpy.sqrt(numpy.abs((1. / (uZ * vZ)) * ((uX * vX - (uX * vZ + uZ * vX) * origin[0] + uZ * vZ * origin[0] * origin[0]) + (uY * vY - (uY * vZ + uZ * vY) * origin[1] + uZ * vZ * origin[1] * origin[1]))))

    A = numpy.array([[f, 0., origin[0]], [0., f, origin[1]], [0., 0., 1.]]).astype('float32')
    Ati = numpy.linalg.inv(numpy.transpose(A))
    Ai = numpy.linalg.inv(A)

    # Step 7: Calculate the real aspect ratio
    aspect_ratio = numpy.sqrt(numpy.dot(numpy.dot(numpy.dot(v, Ati), Ai), v) / numpy.dot(numpy.dot(numpy.dot(u, Ati), Ai), u))

    return aspect_ratio

If you have any info / help don't hesite to share!

Ben Souchet
  • 1,450
  • 6
  • 20
  • without knowing the focal length, the ratio could be arbitrary. unsolvable. not enough information in a single image. -- now if you HAD the focal length... then perhaps. -- you can try this by using OpenCV's `solvePnP` given some quad, a model of some aspect ratio, and a varying focal length. the result is some rvec and tvec (pose). draw the model given all information and see how it matches. – Christoph Rackwitz Dec 19 '22 at 01:05
  • @ChristophRackwitz If you look at the research papers I linked (or the others StackOverflow questions I linked), it possible to determine the focal length and find the aspect ratio. My question is what is the proper implementation or is there alternatives. – Ben Souchet Dec 19 '22 at 10:23
  • The paper "Aspect Ratio..." givesd you the solution, based on the resolution of a linear syste. What more do you want ? –  Dec 19 '22 at 10:35
  • @YvesDaoust I'm not good at reading math and even less to transcribe these equations in Python, my current code (shared on my question) is based on others implementations (linked StackOverflow questions), I looking to ensure my Python code is the correct implementation of the solution described. – Ben Souchet Dec 19 '22 at 11:27
  • Better post on Code Review, then. –  Dec 19 '22 at 13:20

1 Answers1

0

After intensively reading the research papers (links in the question), here it's my implementation based on both methods I shared. Enjoy!

Optimized working implementation (Python3 & NumPy)


def compute_aspect_ratio(image, corners):
    # Based on :
    # - https://www.microsoft.com/en-us/research/publication/2016/11/Digital-Signal-Processing.pdf
    # - http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf
    # - https://andrewkay.name/blog/post/aspect-ratio-of-a-rectangle-in-perspective/

    # Step 1: Get image center, will be used as origin
    h, w = image.shape[:2]
    origin = (w * .5, h * .5)

    # Step 2: Homeneous points coords from image origin
    # /!\ CAREFUL : points need to be in zig-zag order (A, B, D, C)
    p1 = numpy.array([*(corners[0] - origin), 1.])
    p2 = numpy.array([*(corners[1] - origin), 1.])
    p3 = numpy.array([*(corners[3] - origin), 1.])
    p4 = numpy.array([*(corners[2] - origin), 1.])

    # Step 3: Zhengyou Zhang p.10 : equations (11) & (12)
    k2 = numpy.dot(numpy.cross(p1, p4), p3) / numpy.dot(numpy.cross(p2, p4), p3)
    k3 = numpy.dot(numpy.cross(p1, p4), p2) / numpy.dot(numpy.cross(p3, p4), p2)

    # Step 4: Compute the focal length
    f = 0.
    f_sq = -((k3 * p3[1] - p1[1]) * (k2 * p2[1] - p1[1]) + \
             (k3 * p3[0] - p1[0]) * (k2 * p2[0] - p1[0]) ) / ((k3 - 1) * (k2 - 1))
    if f_sq > 0.:
        f = numpy.sqrt(f_sq)
    # If l_sq <= 0, λ cannot be computed, two sides of the rectangle's image are parallel
    # Either Uz and/or Vz is equal zero, so we leave l = 0

    # Step 5: Computing U & V vectors, BUT the z value of these vectors are in the form: z / f
    # Where f is the focal length
    u = (k2 * p2) - p1
    v = (k3 * p3) - p1

    # Step 6: Get U & V
    u = numpy.linalg.norm([u[0], u[1], (u[2] * f)])
    v = numpy.linalg.norm([v[0], v[1], (v[2] * f)])

    return (v / u)

Same version without comments:

def compute_aspect_ratio(image, corners):
    h, w = image.shape[:2]
    origin = (w * .5, h * .5)

    p1 = numpy.array([*(corners[0] - origin), 1.])
    p2 = numpy.array([*(corners[1] - origin), 1.])
    p3 = numpy.array([*(corners[3] - origin), 1.])
    p4 = numpy.array([*(corners[2] - origin), 1.])

    k2 = numpy.dot(numpy.cross(p1, p4), p3) / numpy.dot(numpy.cross(p2, p4), p3)
    k3 = numpy.dot(numpy.cross(p1, p4), p2) / numpy.dot(numpy.cross(p3, p4), p2)

    f = 0.
    f_sq = -((k3 * p3[1] - p1[1]) * (k2 * p2[1] - p1[1]) + \
             (k3 * p3[0] - p1[0]) * (k2 * p2[0] - p1[0]) ) / ((k3 - 1) * (k2 - 1))
    if f_sq > 0.:
        f = numpy.sqrt(f_sq)

    u = (k2 * p2) - p1
    v = (k3 * p3) - p1

    u = numpy.linalg.norm([u[0], u[1], (u[2] * f)])
    v = numpy.linalg.norm([v[0], v[1], (v[2] * f)])

    return (v / u)

Handling parallel sides

Before calling this method you can check if the paper deformation is negligible to avoid all these computations:

PAPER_DEFORMATION_TOLERANCE = 0.01
def compute_paper_size(image, corners):
    # Vectors of the side of the contour (clockwise)
    side_top_vec = corners[1] - corners[0]
    side_rgh_vec = corners[2] - corners[1]
    side_btm_vec = corners[2] - corners[3]
    side_lft_vec = corners[3] - corners[0]

    # Step 1: Compute average width & height of the paper sheet
    paper_avg_width = 0.5 * (numpy.linalg.norm(side_top_vec) + numpy.linalg.norm(side_btm_vec))
    paper_avg_height = 0.5 * (numpy.linalg.norm(side_lft_vec) + numpy.linalg.norm(side_rgh_vec))

    # Step 2: If deformation is negligible avoid computation and return the average dimensions
    #         Checking if the opposite sides are parallel
    if math.isclose((side_top_vec[0] * side_btm_vec[1]), (side_top_vec[1] * side_btm_vec[0]), abs_tol=PAPER_DEFORMATION_TOLERANCE) and \
        math.isclose((side_lft_vec[0] * side_rgh_vec[1]), (side_lft_vec[1] * side_rgh_vec[0]), abs_tol=PAPER_DEFORMATION_TOLERANCE):
        return (paper_avg_width, paper_avg_height)

    # Step 3: Compute aspect ratio
    aspect_ratio = compute_aspect_ratio(image, corners)

    return (paper_avg_width, paper_avg_width * aspect_ratio)
Ben Souchet
  • 1,450
  • 6
  • 20