
I'm using pyopengl to get a depth map.

I am able to get a normalized depth map using glReadPixels(). How can I revert the normalized values to the actual depth in world coordinates?

I've tried playing with glDepthRange(), but it always performs some normalization. Can I disable the normalization at all?

  • Possible duplicate of [depth buffer got by glReadPixels is always 1](https://stackoverflow.com/questions/16768090/depth-buffer-got-by-glreadpixels-is-always-1) – Spektre Jul 13 '18 at 05:43
  • see the [depth buffer got by glReadPixels is always 1](https://stackoverflow.com/a/51130948/2521214) especially function `glReadDepth` there on how to do this for perspective projection. – Spektre Jul 13 '18 at 05:45
  • It's not exactly a duplicate because it's not clear what the problem is on the mentioned link. – strangelyput Jul 17 '18 at 15:37

2 Answers


When you draw your geometry, your vertex shader transforms everything into clip space via the view/projection matrix; after the perspective divide these become normalized device coordinates, where each component is between -1 and 1. There is no way to avoid this: everything outside of this range gets clipped (or clamped, if you enable depth clamping). These device coordinates are then transformed into window coordinates - X and Y are mapped into the range specified with glViewport and Z into the range set with glDepthRange.

You can't disable normalization, because the final values are required to be in 0..1 range. But you can apply the reverse transformation: first, map your depth values back to -1..1 range (if you didn't use glDepthRange, all you have to do is multiply them by 2 and subtract 1). Then, you need to apply the inverse of your projection matrix - you can either do that explicitly by calculating its inverse, or avoid matrix operations by looking into how your perspective matrix is calculated. For a typical matrix, the inverse transform will be

zNorm = 2 * zBuffer - 1
zView = 2 * near * far / ((far - near) * zNorm - near - far)

(Note that zView will be negative, between -near and -far, because in OpenGL your Z axis normally points towards the camera).
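If you read the depth buffer back with PyOpenGL, this can be wrapped in a small helper, for example (a minimal sketch; the numpy dependency, the function name and the near/far values are my own assumptions, not part of the question):

import numpy as np

def depth_to_view_z(depth, near, far):
    # Window-space depth (0..1) -> NDC Z (-1..1) -> view-space Z (-near..-far).
    # Assumes the default glDepthRange(0, 1) and a standard perspective projection.
    z_norm = 2.0 * np.asarray(depth, dtype=np.float64) - 1.0
    return 2.0 * near * far / ((far - near) * z_norm - near - far)

# depth = glReadPixels(0, 0, width, height, GL_DEPTH_COMPONENT, GL_FLOAT)
# z_view = depth_to_view_z(depth, near=0.1, far=100.0)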

Although normally you don't want only depth - you want the full 3D points, so you might as well reconstruct the vector in normalized coordinates and then apply the inverse projection/view transform.
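A sketch of that reconstruction, in the spirit of gluUnProject (the function name, the numpy usage and the assumption that you already have the combined projection*view matrix are mine, not the answer's):

import numpy as np

def unproject(win_x, win_y, depth, inv_proj_view, viewport):
    # Window coordinates + depth -> normalized device coordinates (-1..1 on each axis)
    x0, y0, w, h = viewport
    ndc = np.array([2.0 * (win_x - x0) / w - 1.0,
                    2.0 * (win_y - y0) / h - 1.0,
                    2.0 * depth - 1.0,
                    1.0])
    # Apply the inverse of projection * view and undo the perspective divide
    p = inv_proj_view @ ndc
    return p[:3] / p[3]

# inv_proj_view = np.linalg.inv(projection @ view)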

riv
  • Thanks. I was able to solve it by removing `glDepthRange(0.0, 1.0)` and the renormalization that I was doing after getting the depth from the buffer. I then obtained the 3D points directly by multiplying the points in the camera referential [x, y, 1] by the recovered depth. – strangelyput Jul 17 '18 at 15:32

After the projection to the viewport, the coordinates of the scene are normalized device coordinates (NDC). The normalized device space is a cube, with the left, bottom, front coordinate of (-1, -1, -1) and the right, top, back coordinate of (1, 1, 1). The geometry in this cube is "visible" on the viewport (unless it is covered).

The Z coordinate of normalized device space is mapped to the depth range (glDepthRange), which in general is [0, 1].

How the Z coordinate of view space is transformed to a normalized device Z coordinate, and further to a depth value, depends on the projection matrix.
With an orthographic projection, the Z component is computed by a linear function; with a perspective projection, it is computed by a rational function.
See How to render depth linearly in modern OpenGL with gl_FragCoord.z in fragment shader?.

This means that, to convert a depth value from the depth buffer back to the original Z coordinate, the projection (orthographic or perspective) and the near and far planes have to be known.

In the following it is assumed that the depth range is [0, 1] and depth is a value in this range:

Orthographic Projection

n = near, f = far

z_eye = depth * (f-n) + n;

z_linear = z_eye
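As a sketch in Python (the numpy import and the function name are my additions), this is just a linear remap:

import numpy as np

def ortho_depth_to_eye_z(depth, near, far):
    # Orthographic case: depth 0 maps to the near plane, depth 1 to the far plane
    return np.asarray(depth) * (far - near) + near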

Perspective Projection

n = near, f = far

z_ndc = 2 * depth - 1.0;
z_eye = 2 * n * f / (f + n - z_ndc * (f - n));

If the perspective projection matrix is known this can be done as follows:

A = prj_mat[2][2]
B = prj_mat[3][2]
z_eye = B / (A + z_ndc)
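In Python this could look as follows (a sketch, assuming prj is a 4x4 numpy array in mathematical (row, column) layout, so the coefficients sit at prj[2, 2] and prj[2, 3]; with GLSL-style column-major indexing they are prj_mat[2][2] and prj_mat[3][2] as above):

import numpy as np

def perspective_depth_to_eye_z(depth, prj):
    A = prj[2, 2]   # -(far + near) / (far - near)
    B = prj[2, 3]   # -2 * far * near / (far - near)
    z_ndc = 2.0 * np.asarray(depth) - 1.0
    return B / (A + z_ndc)   # positive distance in front of the camera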

Note that, in either case, transformation by the inverse projection matrix transforms a normalized device coordinate to a coordinate in view space.

Rabbid76