2D pixel coordinate to 3D world coordinate (solvePnP)

Question

Converting 2D Pixels to World Coordinates

I want to convert 2D pixel coordinates in an image captured by the camera to world coordinates on the ground plane. We have a target-tracking project, and I need to project the tracked object's 2D pixel track onto the ground. My approach is as follows:

1. Calibrate the camera to get the intrinsic parameters.
2. Use solvePnP to obtain the extrinsic parameters.
3. Define Zconst = 0.
4. Calculate the scale factor s.
5. Substitute the results of steps 1-4 into the pinhole camera model (see "Computing x,y coordinate (3D) from image point") to find the world coordinates (X, Y, 0).
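
For reference, the relation behind the last three steps can be written out as below. This is a sketch of the standard pinhole model, not necessarily the exact form used in the linked answer. It assumes (u, v) are undistorted pixel coordinates, K is the intrinsic matrix from calibration, R and t are the rotation matrix and translation vector obtained from solvePnP, and the subscript 3 denotes the third component of a vector.

```latex
\begin{align}
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  &= K \left( R \begin{bmatrix} X \\ Y \\ Z_{\text{const}} \end{bmatrix} + t \right) \\
\begin{bmatrix} X \\ Y \\ Z_{\text{const}} \end{bmatrix}
  &= s \, R^{-1} K^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} - R^{-1} t \\
s &= \frac{Z_{\text{const}} + \left( R^{-1} t \right)_3}
          {\left( R^{-1} K^{-1} \begin{bmatrix} u & v & 1 \end{bmatrix}^{\top} \right)_3}
\end{align}
```

The third line follows from the third row of the second: once Z is fixed, s is the only unknown in that row.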

Python Code

import cv2
import numpy as np

def load_parameters():
    with np.load('./scaling.npz') as data:
        intrinsics_matrix = data['intrinsics_matrix']
        distortion_coeffs = data['distortion_coeffs']
    return intrinsics_matrix, distortion_coeffs

def main():
    # Read the image
    sourceImage = cv2.imread(r"1.jpg")
    gray = cv2.cvtColor(sourceImage, cv2.COLOR_BGR2GRAY)

    # Detect chessboard corners
    pattern_size = (8, 6)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if not found:
        print("Unable to find chessboard corners. Please check the image and pattern size.")
    else:
        # Arrange corner points in clockwise direction
        boxPoints = np.array(
            [corners[0][0], corners[pattern_size[0] - 1][0], corners[-1][0], corners[-pattern_size[0]][0]],
            dtype=np.float32)
        worldBoxPoints = np.array([[0, 0, 0], [-0.14, 0, 0], [-0.14, 0.1, 0], [0.10, 0, 0]], dtype=np.float32)  # World coordinates
        print("Corner coordinates:\n", boxPoints)
        print("World coordinates:\n", worldBoxPoints)
        
        # Camera intrinsics matrix and distortion coefficients
        cameraMatrix, distCoeffs = load_parameters()  # Loaded from scaling.npz by load_parameters() above

        # Use solvePnP to get rotation and translation vectors
        _, rvec, tvec = cv2.solvePnP(worldBoxPoints, boxPoints, cameraMatrix, distCoeffs)

        # Convert rotation vector to rotation matrix
        rotationMatrix, _ = cv2.Rodrigues(rvec)

        # Known Z coordinate
        Zconst = 0
        # Choose an index of a corner point (e.g., index of the first corner is 0)
        index = 3
        # Print image coordinates of the selected corner
        print(f"Image Coordinates of corner at index {index}: x={boxPoints[index][0]}, y={boxPoints[index][1]}")
        # Print world coordinates of the selected corner
        print(f"World Coordinates of corner at index {index}: x={worldBoxPoints[index][0]}, y={worldBoxPoints[index][1]}, z={worldBoxPoints[index][2]}")
        
        # Calculate s (as per the principle)
        uvPoint = np.array([boxPoints[index][0], boxPoints[index][1], 1])
        tempMat = np.linalg.inv(cameraMatrix @ rotationMatrix - uvPoint.reshape(-1, 1) @ tvec.reshape(1, -1))
        s = Zconst + np.dot(tempMat[-1, :], tvec)
        s /= np.dot(tempMat[-1, :], uvPoint)

        # Calculate world coordinates (as per the principle)
        worldCoord = rotationMatrix.T @ (np.linalg.inv(cameraMatrix) @ (s * uvPoint) - tvec)

        # Convert worldCoord to a 1x3 vector
        worldCoord = worldCoord.flatten()

        # Print output
        print(f"World Coordinates of corner at index {index}: x={worldCoord[0]}, y={worldCoord[1]}, z={worldCoord[2]}")

Output Result:

Corner Coordinates:

[[241.94821 258.21136]
 [397.4638  261.28043]
 [395.54108 373.2134 ]
 [239.29053 369.498  ]]

World Coordinates (in meters; the calibration board has 9x7 squares with a square size of 2 cm):

[[ 0.    0.    0.  ]
 [-0.14  0.    0.  ]
 [-0.14  0.1   0.  ]
 [ 0.1   0.    0.  ]]

Image Coordinates of corner at index 3:

x=239.29052734375, y=369.49798583984375

World Coordinates of corner at index 3:

x=0.0035804089590803106, y=0.0018645673272446915, z=-0.006464698022980075

Analysis:

The computed world coordinates for the corner at index 3 are evidently incorrect. I expected to recover the world coordinates defined for that corner (with zero-based indexing, index 3 is the fourth entry of worldBoxPoints, [0.1, 0, 0]; the third corner, at index 2, is [-0.14, 0.1, 0]), but the result lies close to the origin and has a nonzero z. The four outer corners returned by the OpenCV function findChessboardCorners were selected in clockwise order, and the world coordinates were defined in the same order. The positive x-direction is to the left, and the positive y-direction is downward (right-handed coordinate system).
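
As a point of comparison, here is a minimal NumPy-only sketch of the back-projection onto the plane Z = Zconst, together with a round-trip check (project known ground points to pixels, then recover them). The function names project and pixel_to_plane, and the camera values K, R, t, are hypothetical and chosen only for illustration; they are not the calibration from this post. The sketch ignores lens distortion, so with a real camera the pixel would first need to be undistorted (for example with cv2.undistortPoints).

```python
import numpy as np


def project(K, R, t, world_point):
    """Pinhole projection without lens distortion: world point -> pixel (u, v)."""
    t = np.asarray(t, dtype=float).ravel()
    cam = R @ np.asarray(world_point, dtype=float) + t  # world -> camera frame
    uvw = K @ cam                                       # camera frame -> homogeneous pixel
    return uvw[:2] / uvw[2]


def pixel_to_plane(K, R, t, uv, z_const=0.0):
    """Back-project an undistorted pixel onto the world plane Z = z_const."""
    t = np.asarray(t, dtype=float).ravel()   # accepts the (3, 1) tvec from solvePnP
    p = np.array([uv[0], uv[1], 1.0])
    left = R.T @ np.linalg.inv(K) @ p        # R^-1 K^-1 [u, v, 1]^T
    right = R.T @ t                          # R^-1 t
    s = (z_const + right[2]) / left[2]       # scale fixed by the known Z
    return s * left - right


# Hypothetical camera used only for the round-trip check (not from the post).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(angle), -np.sin(angle)],
              [0.0, np.sin(angle), np.cos(angle)]])
t = np.array([0.05, -0.02, 0.6])

# Outer corners in clockwise order, in meters. The fourth point is an assumption:
# it places the last corner at [0, 0.10, 0] so the four points form a rectangle.
corners_world = np.array([[0.0, 0.0, 0.0],
                          [-0.14, 0.0, 0.0],
                          [-0.14, 0.10, 0.0],
                          [0.0, 0.10, 0.0]])

recovered = np.array([pixel_to_plane(K, R, t, project(K, R, t, X))
                      for X in corners_world])
print(np.round(recovered, 6))
```

Two differences from the code in the question may be worth checking. First, s here comes from the third components of R^-1 K^-1 [u, v, 1]^T and R^-1 t, which is not the same expression as tempMat. Second, under the convention described above (x positive to the left, y positive downward, corners clockwise), the fourth corner would sit at [0, 0.10, 0] rather than [0.10, 0, 0]; with the latter, three of the four world points are collinear while their image points are not.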