Let's assume I have "look-at" and "perspective" matrices. Now I need to compute a 3D world point from its corresponding 2D point on the camera image and the distance between the camera and that 3D point.

AFAIK, the reverse problem can be solved easily: point2D = lookAt * perspective * point3D. This means that point3D = (lookAt * perspective)^-1 * point2D. But it's not clear to me where the distance should be applied here, or what the "additional" components of the points are, i.e. the values marked with question marks in (x, y, z, ?) and (x, y, ?, ?). I guess those values (or some of them) can be derived from the distance?

Or maybe this task can't be reversed this way? If so, how can I solve it without diving too deep into geometry?

ababo
  • That will not work, as the perspective transform also includes the perspective division, which is not present in the matrix itself, so you would first need to undo the division based on the distance. It is easier to extract the camera focal point and FOV and create the 3D point directly: project the 2D point onto the Znear plane, cast a ray from the focal point through it, and set the vector length to match the distance from the camera. See the [vertex shader in here](https://stackoverflow.com/a/45140313/2521214); it basically does the same. – Spektre Aug 04 '21 at 07:09

2 Answers

1

I have no idea what you call the "look-at" and "perspective" matrices.

That said, assume a coordinate frame attached to the camera such that the optical axis is along Z, the optical center is the origin, and the imaging plane lies at a distance f (expressed in pixels) from it. Then we have the relation

f X/Z = i - w/2
f Y/Z = j - h/2

where X, Y, Z are spatial coordinates, i, j are pixel indexes, and w, h denote the image width and height in pixels.

From this, we obtain the Euclidean distance D to the camera via

f²D²/Z² = f²(X²+Y²+Z²)/Z² = (i - w/2)²+(j-h/2)²+f²

giving

Z = fD/√((i - w/2)² + (j - h/2)² + f²)
X = (Z/f)(i - w/2)
Y = (Z/f)(j - h/2)

You can transform these camera-relative coordinates to your world-coordinates by an affine transformation.
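
The formulas above can be sketched in Python with NumPy. This is only an illustration under the stated pinhole model: the function names, the sample pixel and the pose (R, t) are hypothetical, and D is taken to be the true Euclidean distance.

```python
import numpy as np

def pixel_to_camera(i, j, w, h, f, D):
    """Back-project pixel (i, j) at Euclidean distance D into camera coordinates.

    w, h are the image size in pixels, f is the focal length in pixels.
    """
    u = i - w / 2.0
    v = j - h / 2.0
    Z = f * D / np.sqrt(u * u + v * v + f * f)
    X = Z / f * u
    Y = Z / f * v
    return np.array([X, Y, Z])

def camera_to_world(p_cam, R, t):
    """Affine map to world coordinates: world = R @ p_cam + t.

    R (3x3 rotation) and t (camera position) are an assumed camera pose.
    """
    return R @ p_cam + t

# Hypothetical values: a 640x480 image, f = 500 px, point 10 units away.
p_cam = pixel_to_camera(i=400, j=300, w=640, h=480, f=500.0, D=10.0)
p_world = camera_to_world(p_cam, R=np.eye(3), t=np.zeros(3))
print(p_cam, np.linalg.norm(p_cam))
```

The norm of the returned camera-frame point should equal D, which is a quick sanity check on the derivation.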

Note that "distance to the camera" can be understood in different ways. It can be the true Euclidean distance or the distance to the plane that contains the point.
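
To make the two readings concrete, here is a small sketch (hypothetical helper names, same pinhole model and symbols as above) that converts between the plane distance Z and the Euclidean distance D for a given pixel. The two agree only at the image center.

```python
import math

def depth_to_distance(i, j, w, h, f, Z):
    """Euclidean distance D of the point seen at pixel (i, j) with plane depth Z."""
    u, v = i - w / 2.0, j - h / 2.0
    return Z * math.sqrt(u * u + v * v + f * f) / f

def distance_to_depth(i, j, w, h, f, D):
    """Plane depth Z of the point seen at pixel (i, j) at Euclidean distance D."""
    u, v = i - w / 2.0, j - h / 2.0
    return f * D / math.sqrt(u * u + v * v + f * f)

# At the image center the two notions coincide; off-center, D > Z.
print(depth_to_distance(320, 240, 640, 480, 500.0, 5.0))
print(depth_to_distance(0, 0, 640, 480, 500.0, 5.0))
```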

  • By "look-at" and "perspective" matrices I mean the matrices I get after calling functions equivalent to "perspectiveRH()" and "lookAtRH()" (pretty conventional names in different libs). – ababo Aug 04 '21 at 08:45
0

I seriously doubt that

(lookAt * perspective)^-1

exists. You cannot invert lookAt: it's a projection matrix, so it has a one-dimensional kernel and thus a zero determinant.

Since there is no specific information about these matrices, which could help me figure out a more geometric solution, just use brute force calculations. Set up the system of linear equations

A = lookAt * perspective
y = 2Dpoint
x = 3Dpoint

A * x = y 

You know y, so solve for x by, say, Gaussian elimination. You will obtain a one-parameter family of solutions x = x0 + t*u0. Then take the vector e4 = [0,0,0,1] and set up the equation

|perspective*(x0 + t*u0) - e4|^2 = (distance_between_x_and_camera)^2

for the unknown variable t. This is a quadratic equation, so it is straightforward to solve. When you find the solution t0 (there will be two of them, pick the one that makes the 3D point closer to the 2D point) you get your position

3Dpoint = x = x0 + t0*u0
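
A rough NumPy sketch of this idea follows, under several assumptions that are not in the answer: OpenGL-style right-handed matrices applied to column vectors (clip = perspective @ lookAt @ point), a 2D point given in normalized device coordinates, and the perspective division folded into the linear system. `perspective_rh`, `look_at_rh` and `unproject` are hypothetical helpers written for this example, not a particular library's API. The distance is measured from the camera position recovered from the view matrix, and the root in front of the camera is kept.

```python
import numpy as np

def perspective_rh(fovy, aspect, near, far):
    """OpenGL-style right-handed perspective matrix (assumed convention)."""
    s = 1.0 / np.tan(fovy / 2.0)
    return np.array([
        [s / aspect, 0.0, 0.0, 0.0],
        [0.0, s, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ])

def look_at_rh(eye, target, up):
    """Right-handed view matrix looking from eye towards target."""
    fwd = (target - eye) / np.linalg.norm(target - eye)
    right = np.cross(fwd, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, fwd)
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = right, true_up, -fwd
    view[:3, 3] = -view[:3, :3] @ eye
    return view

def unproject(ndc_xy, distance, persp, view):
    """Recover the world point seen at ndc_xy at the given Euclidean distance."""
    A = persp @ view
    x_ndc, y_ndc = ndc_xy
    # Perspective division folded in: (A[0] - x*A[3]).X = 0, (A[1] - y*A[3]).X = 0
    rows = np.stack([A[0] - x_ndc * A[3], A[1] - y_ndc * A[3]])
    M, b = rows[:, :3], -rows[:, 3]
    x0 = np.linalg.lstsq(M, b, rcond=None)[0]  # particular solution
    u0 = np.linalg.svd(M)[2][-1]               # unit null-space direction
    eye = -view[:3, :3].T @ view[:3, 3]        # camera position from the view matrix
    # |x0 + t*u0 - eye|^2 = distance^2 is a quadratic in t
    d = x0 - eye
    half_b = d @ u0
    disc = max(half_b ** 2 - (d @ d - distance ** 2), 0.0)
    for t in (-half_b + np.sqrt(disc), -half_b - np.sqrt(disc)):
        p = x0 + t * u0
        if (A @ np.append(p, 1.0))[3] > 0.0:   # clip-space w > 0: in front of the camera
            return p
    raise ValueError("no solution in front of the camera")

# Round trip with hypothetical values: project a point, then recover it.
eye = np.array([1.0, 2.0, 3.0])
persp = perspective_rh(np.radians(60.0), 4.0 / 3.0, 0.1, 100.0)
view = look_at_rh(eye, np.zeros(3), np.array([0.0, 1.0, 0.0]))
point = np.array([0.5, -0.3, 0.2])
clip = persp @ view @ np.append(point, 1.0)
print(unproject(clip[:2] / clip[3], np.linalg.norm(point - eye), persp, view))
```

The two rows of the system describe the line through the camera on which every candidate 3D point lies, so the quadratic always has two roots, one on each side of the camera.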
Futurologist