
I am trying to detect a chessboard using 8 ArUco markers, OpenCV and Python. The marker detection works fine, but as soon as a player makes a move, at least one of the markers is usually covered by their arm. Since most of the markers can still be detected, it should be possible to estimate the missing point from the positions of the others. To illustrate my setup I have linked a picture: Correct Marker Points

My first attempt to predict a missing point was to compute the unknown transformation matrix from world space to image space. To represent the 8 marker corner positions, the world space coordinates [1,0,0], [1,50,0], [1,75,0], [1,100,0], [1,0,100], [1,50,100], [1,75,100] and [1,100,100] were used (the leading 1 is the homogeneous coordinate). These are therefore always known and represented by matrix W. The screen space coordinates of the marker points are computed by OpenCV and represented by matrix S. For the sake of argument, let's pretend one marker was not detected and its point needs to be estimated. The transformation matrix from W to S (i.e. solving W * X = S for X) was then computed from the given 7 points, and to estimate the missing point its world space coordinates were multiplied with X. The problem is that X does not incorporate the perspective transformation and therefore projects the estimated point incorrectly. To illustrate this, a second picture is linked where all points were correctly detected but are then projected by the projection matrix X: Incorrect Marker Points

A quick snippet of python code which shows how X is computed and points projected:

import numpy as np

ids = [81, 277, 939, 275, 683, 677, 335, 981]

corner_world_coord = {
    683: [1, 0, 0],
    275: [1, 50, 0],
    939: [1, 75, 0],
    81: [1, 100, 0],
    335: [1, 0, 100],
    677: [1, 50, 100],
    277: [1, 75, 100],
    981: [1, 100, 100]
}

# aruco_corners maps marker id -> detected 2D screen point (from cv2.aruco)
W = np.array([corner_world_coord[i] for i in ids], dtype=float)
S = np.array([aruco_corners[i] for i in ids], dtype=float)

# Least-squares solve of W * X = S for the 3x2 matrix X
X, res, _, _ = np.linalg.lstsq(W, S, rcond=None)

# Each estimated point is 2D, so the result has one row per marker
estimate = np.zeros((len(ids), 2))

for idx, corner in enumerate(W):
    estimate[idx] = np.dot(corner, X)

The residual of the least squares computation of X is always equal to 0. My question therefore is: is there a way to compute the screen coordinates of a missing point, given the world space and screen space coordinates of multiple other points?

  • ArUco also returns the rotation `rvec` and translation `tvec` vectors, which allow transforming a 3D point expressed in the object frame to the camera frame. Each marker point is expressed in a global object frame. If a marker is not detected, you can transform its coordinates using one of the other markers' poses and then project it to the image plane using the intrinsic parameters. – Catree Feb 26 '18 at 21:35
  • As you have multiple markers, you will have multiple `rvec` and `tvec`. You may need to transform everything into a common reference frame / global object frame. See the documentation for more information about the [camera frame](https://docs.opencv.org/3.4.1/d9/d0c/group__calib3d.html#details) and [homogeneous transformations](https://docs.opencv.org/3.4.1/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033d). – Catree Feb 26 '18 at 21:38
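
A minimal sketch of what these comments describe, assuming a calibrated camera: `camera_matrix`, `dist_coeffs` and `marker_length` are assumptions here (intrinsics from a prior `cv2.calibrateCamera` run and the physical marker side length), and `estimate_occluded_marker` is a hypothetical helper name, not part of the question's code.

import cv2
import numpy as np

def estimate_occluded_marker(detected_marker_corners, offset_in_marker_frame,
                             marker_length, camera_matrix, dist_coeffs):
    # Pose (rvec, tvec) of one detected marker in the camera frame;
    # detected_marker_corners is the (1, 4, 2) float32 corner array that
    # cv2.aruco.detectMarkers returned for that marker.
    poses = cv2.aruco.estimatePoseSingleMarkers(
        [detected_marker_corners], marker_length, camera_matrix, dist_coeffs)
    rvec, tvec = poses[0][0], poses[1][0]

    # 3D position of the occluded marker expressed in the detected
    # marker's object frame; all markers lie in the board plane, so z = 0.
    object_point = np.array([offset_in_marker_frame], dtype=np.float32)

    # Project the 3D point into the image using the intrinsics.
    image_point, _ = cv2.projectPoints(
        object_point, rvec, tvec, camera_matrix, dist_coeffs)
    return image_point.reshape(2)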

1 Answer


I was able to find a solution in the following question: How to draw a Perspective-Correct Grid in 2D

In this case, 4 "non-collinear" point correspondences between world space and image space are required. I.e. remove the leading ones from the world coordinates to obtain [0,0], [50,0], [75,0], [100,0], [0,100], [50,100], [75,100] and [100,100]. Non-collinear is probably not the exact term; what is meant is that the 4 chosen points must form a quadrilateral, so at most 2 of them may lie on the same line. The x coordinates of these 4 world points are called x1...x4 and the y coordinates y1...y4. The coordinates of the corresponding image space points are called x1p...x4p and y1p...y4p (p stands for prime). The computation of the perspective-correct transformation matrix is then given in the code below:

import numpy as np

def compute_proj_matrix(world_points, image_points):
    # Solve A * C = B for the 8 homography coefficients C,
    # where A is the following 8x8 matrix:
    # x1   y1    1    0   0   0   -x1*x1p  -y1*x1p
    # 0    0     0   x1  y1   1   -x1*y1p  -y1*y1p
    # x2   y2    1    0   0   0   -x2*x2p  -y2*x2p
    # 0    0     0   x2  y2   1   -x2*y2p  -y2*y2p
    # x3   y3    1    0   0   0   -x3*x3p  -y3*x3p
    # 0    0     0   x3  y3   1   -x3*y3p  -y3*y3p
    # x4   y4    1    0   0   0   -x4*x4p  -y4*x4p
    # 0    0     0   x4  y4   1   -x4*y4p  -y4*y4p
    # and B = [x1p, y1p, x2p, y2p, x3p, y3p, x4p, y4p]
    x1, x2, x3, x4 = world_points[:, 0]
    y1, y2, y3, y4 = world_points[:, 1]
    x1p, x2p, x3p, x4p = image_points[:, 0]
    y1p, y2p, y3p, y4p = image_points[:, 1]
    A = np.array([
        [x1, y1, 1,  0,  0, 0, -x1*x1p, -y1*x1p],
        [ 0,  0, 0, x1, y1, 1, -x1*y1p, -y1*y1p],
        [x2, y2, 1,  0,  0, 0, -x2*x2p, -y2*x2p],
        [ 0,  0, 0, x2, y2, 1, -x2*y2p, -y2*y2p],
        [x3, y3, 1,  0,  0, 0, -x3*x3p, -y3*x3p],
        [ 0,  0, 0, x3, y3, 1, -x3*y3p, -y3*y3p],
        [x4, y4, 1,  0,  0, 0, -x4*x4p, -y4*x4p],
        [ 0,  0, 0, x4, y4, 1, -x4*y4p, -y4*y4p]])
    B = np.array([x1p, y1p, x2p, y2p, x3p, y3p, x4p, y4p])
    return np.linalg.solve(A, B)

Mapping a new (in the above case, missing) point is then done by:

def map_point(proj_matrix, point):
    # Apply the homography, including the perspective divide
    x, y = point
    factor = 1.0 / (proj_matrix[6] * x + proj_matrix[7] * y + 1.0)
    projected_x = factor * (proj_matrix[0] * x + proj_matrix[1] * y + proj_matrix[2])
    projected_y = factor * (proj_matrix[3] * x + proj_matrix[4] * y + proj_matrix[5])
    return np.array([projected_x, projected_y])
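
For completeness, a small usage sketch with the question's coordinate layout; the image coordinates below are made up purely for illustration, and marker 677 plays the occluded point:

# World coordinates of 4 detected markers, without the leading homogeneous 1
world_points = np.array([[0, 0], [100, 0], [0, 100], [100, 100]], dtype=float)
# Their detected screen positions (made-up values for illustration)
image_points = np.array([[212, 80], [520, 95], [190, 410], [545, 430]], dtype=float)

C = compute_proj_matrix(world_points, image_points)
# Marker 677 sits at world coordinates (50, 100); estimate its screen position
estimate = map_point(C, (50, 100))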

Why and how this works is best explained in the question linked above, as frankly I do not fully understand it myself and am just happy to have found a solution.
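
For what it's worth, OpenCV ships the same computation: `cv2.getPerspectiveTransform` solves for the 3x3 homography from 4 point pairs, and `cv2.perspectiveTransform` applies it, perspective divide included. A minimal sketch with the same made-up points as above:

import cv2
import numpy as np

world_points = np.float32([[0, 0], [100, 0], [0, 100], [100, 100]])
image_points = np.float32([[212, 80], [520, 95], [190, 410], [545, 430]])

H = cv2.getPerspectiveTransform(world_points, image_points)
# perspectiveTransform expects input of shape (N, 1, 2)
missing = cv2.perspectiveTransform(np.float32([[[50, 100]]]), H)
print(missing.reshape(2))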