If you use a gfx pipeline where positions (w=1.0) and vectors (w=0.0) are transformed to NDC like this:
(x',y',z',w') = M*(x,y,z,w) // applying transforms
(x'',y'') = (x',y')/w' // perspective divide
where M is the product of all your 4x4 homogeneous transform matrices, multiplied together in their order. If you want to go back to the original (x,y,z), you need to know w', which can be computed from z; the exact equation depends on your projection. In that case you can do this:
w' = f(z') // z' is usually the value encoded in the depth buffer and can be obtained from it
(x',y') = (x'',y'')*w' // undo perspective divide: screen -> camera
(x,y,z,w) = Inverse(M)*(x',y',z',w') // camera -> world
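A minimal pure-Python sketch of that round trip under one common convention (the matrix layout, the relation w' = z, and all names here are my assumptions for illustration, not taken from any particular API):

```python
# Project a camera-space point with a perspective matrix M, then recover
# it via Inverse(M). 4x4 matrices as nested lists, row-major; the
# OpenGL-style frustum layout below is an assumption.

def mat_vec(M, v):                       # 4x4 matrix * 4x1 vector
    return [sum(M[r][c] * v[c] for c in range(4)) for r in range(4)]

def mat_inv(M):                          # Gauss-Jordan inverse of a 4x4
    n = 4
    A = [list(M[r]) + [1.0 if r == c else 0.0 for c in range(n)]
         for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

def perspective(znear, zfar):            # symmetric frustum, 90 deg fov
    a = (zfar + znear) / (zfar - znear)
    b = -2.0 * zfar * znear / (zfar - znear)
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, a, b],
            [0, 0, 1, 0]]                # bottom row => w' = z (camera depth)

M = perspective(0.1, 100.0)
p = [0.3, -0.2, 5.0, 1.0]                # camera-space position, w = 1

# forward: clip space, then perspective divide
xc, yc, zc, wc = mat_vec(M, p)
ndc = [xc / wc, yc / wc, zc / wc]        # (x'', y'', z'')

# backward: w' is known here (wc), so undo the divide and apply M^-1
clip = [ndc[0] * wc, ndc[1] * wc, ndc[2] * wc, wc]
q = mat_vec(mat_inv(M), clip)
print(q)  # ≈ [0.3, -0.2, 5.0, 1.0]
```

In a real shader you would not have wc; you would reconstruct it from the depth-buffer value via the f(z') above, which is exactly the part that depends on your projection matrix.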
However, this can be used only if you know z' and can derive w' from it. So what is usually done instead (when we cannot) is to cast a ray from the camera focal point through (x'',y'') and stop at the wanted perpendicular distance to the camera. For perspective projection you can look at it as triangle similarity:

So for each vertex you want to transform, you need its projected x'',y'' position on the znear plane (screen) and then just scale x'',y'' by the ratio between the distances to the camera focal point (*z1/z0). Now all we need is the focal length z0. That one depends on the kind of projection matrix you use. I usually encounter 2 versions: when you are in the camera coordinate system, the point (0,0,0) is either the focal point or lies on the znear plane. However, the projection matrix can be anything, hence the focal point position can vary too ...
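The similar-triangles scaling above can be sketched like this, under the assumption that the focal length z0 equals the distance from the focal point to the screen plane (the function name and values are illustrative):

```python
# Scale a point (x'', y'') on the screen plane out along the view ray
# to a chosen perpendicular depth z1. Assumes the focal point is at the
# origin and the screen plane sits at distance z0 in front of it.

def unproject_ray(xpp, ypp, z0, z1):
    """Similar triangles: (x, y) / z1 == (x'', y'') / z0."""
    s = z1 / z0
    return (xpp * s, ypp * s, z1)

z0 = 1.0                       # focal length (assumed == znear here)
x, y, z = unproject_ray(2.0, -1.0, z0, 5.0)
print(x, y, z)  # 10.0 -5.0 5.0
```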
Now when you have to deal with aspect ratio, the first method handles it internally, as the correction is baked into M. The second method needs the inverse of the aspect-ratio correction applied before the conversion, so apply it directly on x'',y''.
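Undoing that correction for the second method might look like this, assuming the projection scaled x by 1/aspect with aspect = width/height (a common convention; if your matrix scales y instead, adjust accordingly):

```python
# Undo aspect-ratio correction on (x'', y'') before the z1/z0 scaling.
# Assumes the projection divided x by aspect = width/height.

def undo_aspect(xpp, ypp, width, height):
    aspect = width / height
    return (xpp * aspect, ypp)

x, y = undo_aspect(0.5, 0.25, 2.0, 1.0)
print(x, y)  # 1.0 0.25
```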