
I am trying to deproject a 2D image point to a 3D point, and there seems to be just as much misinformation as there is information out there on how to do it. I have the problem modeled in UE4, where I know that:

Camera Location: (1077,1133,450)

Camera Rotation (degrees): yaw = 90, pitch = 345, roll=0

Camera Horizontal FOV (degrees): 54.43224

Camera Vertical FOV (degrees): 32.68799

Camera Shear: 0

Camera Resolution: 1280x720

Object Location in world: (923,2500,0)

Object Location in image frame: (771,427)

From the above data and this method, I have intrinsic camera matrix:

K = [[1.24444399e+03 0.00000000e+00 6.40000000e+02]
     [0.00000000e+00 1.22760403e+03 3.60000000e+02]
     [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
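For reference, K can be rebuilt directly from the FOV and resolution above (a sketch assuming the standard pinhole model with the principal point at the image centre):

```python
import numpy as np

width, height = 1280, 720
hfov_deg, vfov_deg = 54.43224, 32.68799

# Pinhole model: focal length in pixels from the half-angle of view.
fx = (width / 2) / np.tan(np.radians(hfov_deg) / 2)
fy = (height / 2) / np.tan(np.radians(vfov_deg) / 2)

K = np.array([[fx,  0.0, width / 2],
              [0.0, fy,  height / 2],
              [0.0, 0.0, 1.0]])
```

This reproduces the fx ≈ 1244.4 and fy ≈ 1227.6 shown above.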

And rotation matrix:

R = [[ 5.91458986e-17 -1.00000000e+00 -1.58480958e-17]
     [ 9.65925826e-01  6.12323400e-17 -2.58819045e-01]
     [ 2.58819045e-01  0.00000000e+00  9.65925826e-01]]
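As a cross-check, that R is exactly what you get by composing elementary rotations as Rz(yaw) · Ry(pitch) · Rx(roll), taking pitch = 345° as −15°. A sketch:

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

yaw, pitch, roll = np.radians([90.0, -15.0, 0.0])

# ZYX composition: roll first, then pitch, then yaw
R = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)
```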

which I verified with this tool.

I first tried to do this without the intrinsic camera matrix, using this method, but that did not work because the intrinsic parameters are, in fact, required.

I next attempted this solution, which I implemented in Python, removing the code that would have calculated the intrinsic and extrinsic parameters I already have:

import numpy as np

# K and camera_rotation_matrix as defined above

def deproject_pixel(u, v):
    inv_camera_rotation_matrix = np.linalg.inv(camera_rotation_matrix)
    inv_camera_intrinsic_matrix = np.linalg.inv(K)

    # Homogeneous pixel coordinates
    uv_matrix = np.array([
        [u],
        [v],
        [1]])

    # Camera centre in world coordinates
    tx = 1077
    ty = 1133
    tz = 450

    txyz_matrix = np.array([
        [-tx],
        [-ty],
        [-tz]])

    # Translation column of the extrinsic matrix: t = -R @ C
    Tx_Ty_Tz = camera_rotation_matrix @ txyz_matrix

    ls = inv_camera_rotation_matrix @ inv_camera_intrinsic_matrix @ uv_matrix
    rs = inv_camera_rotation_matrix @ Tx_Ty_Tz

    # Choose the scale s so that the Z component of the result is 0
    s = rs[2][0] / ls[2][0]
    world_point = s * ls - rs

    return world_point

I believe the above code is equivalent to this coded solution in C++ but perhaps I made a mistake?

Running deproject_pixel(771,427) returns (929,1182,0), which is close in X but very far off in Y

I then tried another implementation that requires the full camera matrix M:

t = camera_rotation_matrix @ np.array([[-1077], [-1133], [-450]])  # t = -R @ C
M = K @ np.concatenate((camera_rotation_matrix, t), 1)

def deproject_pixel_2(u, v):
    A = (M[0][1] - M[2][1]*u) / (M[1][1] - M[2][1]*v)
    B = (M[2][0]*u - M[0][0]) / (M[2][0]*v - M[1][0])

    X = ((M[1][3] - M[2][3]*v)*A - M[0][3] + M[2][3]*u) / (A*(M[2][0]*v - M[1][0]) - M[2][0]*u + M[0][0])
    Y = (M[0][3] - M[2][3]*u - B*(M[1][3] - M[2][3]*v)) / (B*(M[1][1] - M[2][1]*v) - M[0][1] + M[2][1]*u)

    world_point = [X, Y, 0]
    return world_point

But once again, running deproject_pixel_2(771,427) returns (929,1182,0), which is close in X but very far off in Y
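For what it's worth, both implementations agree with a more direct ray–plane formulation: invert K to get a ray through the pixel, rotate it into the world frame, and scale it so the ray hits Z = 0. A sketch using the K, R, and camera position from above (numbers re-entered so the snippet stands alone):

```python
import numpy as np

K = np.array([[1244.44399, 0.0, 640.0],
              [0.0, 1227.60403, 360.0],
              [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0],
              [0.96592583, 0.0, -0.25881905],
              [0.25881905, 0.0, 0.96592583]])
C = np.array([1077.0, 1133.0, 450.0])  # camera centre in world coordinates

def pixel_to_z0(u, v):
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction, camera frame
    ray_world = R.T @ ray_cam                           # rotate into the world frame
    s = -C[2] / ray_world[2]                            # scale so Z lands on 0
    return C + s * ray_world
```

Running `pixel_to_z0(771, 427)` lands on roughly (929.7, 1182.8, 0), the same point both implementations produce, and reprojecting that point through K[R|t] recovers (771, 427) exactly. So the algebra is self-consistent, and the discrepancy has to come from the rotation matrix itself, i.e. from the engine's rotation conventions.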

Can anyone please point out what I am doing wrong here? Is a matrix calculation incorrect? Are both of these implementations simultaneously wrong in the same way?

UPDATE 1: I moved the camera to the origin and removed rotation. I can now work out rotation offsets when rotating about a single axis, but combining multiple axes of rotation changes what the offsets need to be, so I can properly deproject only with a single axis of rotation. Suggestions? I have learned that Unreal may handle rotation differently than standard notation suggests.
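That is the classic symptom of an Euler-order mismatch: with a single non-zero angle every composition order gives the same matrix, but as soon as two axes are involved the orders diverge. A quick illustration:

```python
import numpy as np

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

yaw, pitch = np.radians(90.0), np.radians(-15.0)

# Same two angles, two different composition orders:
R_zy = rot_z(yaw) @ rot_y(pitch)   # yaw applied last
R_yz = rot_y(pitch) @ rot_z(yaw)   # pitch applied last

print(np.allclose(R_zy, R_yz))     # False: the orders disagree once two axes rotate
```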

Further reading:

Transforming 2D image coordinates to 3D world coordinates with z = 0

OpenCV unproject 2D points to 3D with known depth `Z`

Get 3D coordinates from 2D image pixel if extrinsic and intrinsic parameters are known

https://answers.opencv.org/question/62779/image-coordinate-to-world-coordinate-opencv/

http://ksimek.github.io/2013/08/13/intrinsic/

  • maybe start with more simple setups, like camera in (0,0,0) without rotation. There are a lot of possible mistakes, includes mismatch of coordinate system directions, inversions of transformation matrices, etc. – Micka Dec 09 '19 at 20:09
  • Thank you for the suggestion. The programs do not work correctly in these circumstances either, but at least I can look at the behavior more easily under these conditions @Micka – Andrew Carluccio Dec 09 '19 at 20:32
  • @Micka I updated the question – Andrew Carluccio Dec 10 '19 at 22:07
  • How are you planning on getting depth information? Image coordinates give you an azimuth and elevation relative to the camera. Unless you have an idea of what the depth is, you physically/mathematically can't deproject in a meaningful manner – Mad Physicist Dec 10 '19 at 22:10
  • I know the Z coordinate of the object (it is on the Z=0 plane) @MadPhysicist – Andrew Carluccio Dec 10 '19 at 22:13
  • Then you're just mapping from one plane to another. I'll take a look once I get home – Mad Physicist Dec 10 '19 at 22:14
  • Yep @MadPhysicist I just need a mapping from the image plane of the camera to the location on the Z=0 plane. Thanks in advance for looking when you get a chance. – Andrew Carluccio Dec 10 '19 at 22:16

1 Answer


It's just UE4 being odd about the rotation matrix. I modeled the problem in Blender 2.79 and everything works perfectly if you rotate the camera in ZYX (yaw-pitch-roll) order and add a 180-degree offset to the roll (i.e. use 180 − reported_angle) in the code.

The upshot: know your engine and how it does or does not conform to published standards!
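In code, the fix described above amounts to composing the matrix in ZYX order with the roll flipped (a sketch; the exact axis conventions still depend on the engine, so treat the 180° offset as something to verify against your own setup):

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def ue4_camera_rotation(yaw_deg, pitch_deg, roll_deg):
    # ZYX (yaw, pitch, roll) composition with the 180° roll offset
    # the answer describes: roll -> 180 - reported_angle
    yaw, pitch, roll = np.radians([yaw_deg, pitch_deg, 180.0 - roll_deg])
    return rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)
```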