Get depth from camera for each pixel

Question

I have a mesh model and, using VTK, have rendered a view of it from a given camera position (x,y,z). I can save this to an RGB image (640x480) but I also want to save a depth map where each pixel stores the value of the depth from the camera.

I have tried using the Z-buffer values given by the render window by following this example. The problem is that the Z-buffer only stores values in the range [0,1]. Instead, I am trying to create a synthetic range image, where each pixel stores its depth/distance from the camera. Analogous to the image produced by a Kinect, I want to create one from a specific viewpoint of a mesh model.

EDIT - adding some code

My current code:

Load the mesh

string mesh_filename = "mesh.ply";
vtkSmartPointer<vtkPLYReader> mesh_reader = read_mesh_ply(mesh_filename);

vtkSmartPointer<vtkPolyDataMapper> mapper = vtkSmartPointer<vtkPolyDataMapper>::New();
mapper->SetInputConnection(mesh_reader->GetOutputPort());

vtkSmartPointer<vtkActor> actor = vtkSmartPointer<vtkActor>::New();
actor->SetMapper(mapper);

vtkSmartPointer<vtkRenderer> renderer = vtkSmartPointer<vtkRenderer>::New();
vtkSmartPointer<vtkRenderWindow> renderWindow = vtkSmartPointer<vtkRenderWindow>::New();
renderWindow->AddRenderer(renderer);
renderWindow->SetSize(640, 480);

vtkSmartPointer<vtkRenderWindowInteractor> renderWindowInteractor = vtkSmartPointer<vtkRenderWindowInteractor>::New();
renderWindowInteractor->SetRenderWindow(renderWindow);

// Add the actors to the scene
renderer->AddActor(actor);
renderer->SetBackground(1, 1, 1);

Create a camera and place it somewhere

vtkSmartPointer<vtkCamera> camera = vtkSmartPointer<vtkCamera>::New();

renderer->SetActiveCamera(camera);
camera->SetPosition(0, 0, 650);
// Render and interact
renderWindow->Render();

Get result from the z buffer

double b = renderer->GetZ(320, 240);

In this example, this gives 0.999995. Since the values lie in [0,1], I don't know how to interpret this. As you can see, I have set the camera 650 units away along the z-axis, so I assume the depth at this pixel (which lies on the object in the rendered RGB image) should be close to 650.

Did you read this? http://www.vtk.org/Wiki/VTK/Examples/Cxx/Utilities/ZBuffer — Amadeus, Jul 15 '13 at 17:05
@TomásBadan Hi, yes I have read this example. The problem is that the Z-buffer only stores values in the range [0,1]. Instead, I am trying to create a synthetic range image, where I can get the depth/distance of each pixel from the camera. (editing question with this comment) — Aly, Jul 15 '13 at 17:13
In OpenGL, z-buffer values are normalized: 1 means as far as possible and 0 means as near as possible. Are you sure that this buffer isn't normalized too? — Amadeus, Jul 15 '13 at 18:56
@TomásBadan yes, that is the problem. Is there a way to convert this number to a "real" depth, i.e. in the units of the model? — Aly, Jul 16 '13 at 08:24
As Tomás Badan says, 1 means as far as possible and 0 as near as possible. If your camera's near plane is at 0 and its far plane at 650, a z value of 1 means 650. Real distance = zValue * (far - near) + near. In your case: Real distance = zValue * 650 — Adrian Maire, Jul 16 '13 at 08:28
@AdrianMaire I see, How can I find the actual near and far values for my camera/how can I set them? — Aly, Jul 16 '13 at 08:33
When you set up OpenGL, you probably have a line like glFrustum(left, right, bottom, top, near, far). Or maybe you have used the GLU equivalent, gluPerspective()? — Adrian Maire, Jul 16 '13 at 08:42
@AdrianMaire I am not making these calls; all my code is shown above. I am using VTK, which is probably setting the clipping planes at some point — Aly, Jul 16 '13 at 08:57
@Ali I have no knowledge of VTK, but maybe the following function can help you: vtkCamera::GetFrustumPlanes(double aspect, double planes[24]) — Adrian Maire, Jul 16 '13 at 09:05
The formula Adrian posted is only correct for orthographic projections. For perspective projections, the z-buffer isn't linear. My guess is that that's what the SetInputBufferTypeToZBuffer() method is supposed to take care of. — Andreas Haferburg, Jul 16 '13 at 22:13
Hi, did you actually manage to solve this? I am facing the same problem right now — Kev1n91, Sep 28 '17 at 17:29
Same problem here. Any solution other than rewriting the entire rendering pipeline? — Matthieu G, Jun 25 '18 at 12:09

Mr.Epic Fail · Answer 1 · 2020-07-08T10:50:30.263

This Python snippet illustrates how to convert z-buffer values to real distances for a perspective camera. The non-linear mapping is defined as follows:

numerator = 2.0 * z_near * z_far
denominator = z_far + z_near - (2.0 * z_buffer_data_numpy - 1.0) * (z_far - z_near)
depth_buffer_data_numpy = numerator / denominator
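
For reference, this mapping appears to be the inverse of the depth mapping of a standard OpenGL perspective projection. The derivation below is only a sketch under that assumption (perspective camera, default depth range [0,1]). The symbols n, f, z and d stand for z_near, z_far, the z-buffer value and the recovered depth along the viewing direction; z_ndc is an intermediate name introduced here for illustration.

```latex
\begin{align}
z_{\mathrm{ndc}} &= 2z - 1, \\
z_{\mathrm{ndc}} &= \frac{f + n}{f - n} - \frac{2fn}{(f - n)\,d}, \\
d &= \frac{2fn}{f + n - (2z - 1)(f - n)}.
\end{align}
```

Setting z = 0 gives d = n and z = 1 gives d = f, so the two ends of the z-buffer range land on the clipping planes, as expected.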

Here is a full example:

import vtk
import numpy as np
from vtk.util import numpy_support
import matplotlib.pyplot as plt

vtk_renderer = vtk.vtkRenderer()
vtk_render_window = vtk.vtkRenderWindow()
vtk_render_window.AddRenderer(vtk_renderer)
vtk_render_window_interactor = vtk.vtkRenderWindowInteractor()
vtk_render_window_interactor.SetRenderWindow(vtk_render_window)
vtk_render_window_interactor.Initialize()

source = vtk.vtkCubeSource()
mapper = vtk.vtkPolyDataMapper()
mapper.SetInputConnection(source.GetOutputPort())
actor = vtk.vtkActor()
actor.SetMapper(mapper)
actor.RotateX(60.0)
actor.RotateY(-35.0)
vtk_renderer.AddActor(actor)

vtk_render_window.Render()
active_vtk_camera = vtk_renderer.GetActiveCamera()
z_near, z_far = active_vtk_camera.GetClippingRange()

z_buffer_data = vtk.vtkFloatArray()
width, height = vtk_render_window.GetSize()
vtk_render_window.GetZbufferData(
    0, 0, width - 1, height - 1, z_buffer_data)
z_buffer_data_numpy = numpy_support.vtk_to_numpy(z_buffer_data)
z_buffer_data_numpy = np.reshape(z_buffer_data_numpy, (-1, width))
z_buffer_data_numpy = np.flipud(z_buffer_data_numpy)  # flipping along the first axis (y)

numerator = 2.0 * z_near * z_far
denominator = z_far + z_near - (2.0 * z_buffer_data_numpy - 1.0) * (z_far - z_near)
depth_buffer_data_numpy = numerator / denominator
non_depth_data_value = np.nan
depth_buffer_data_numpy[z_buffer_data_numpy == 1.0] = non_depth_data_value

print(np.nanmin(depth_buffer_data_numpy))
print(np.nanmax(depth_buffer_data_numpy))

plt.imshow(np.asarray(depth_buffer_data_numpy))
plt.show()

Side note: On my system, the imshow command occasionally did not display anything. Re-running the script solved that issue.
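
The mapping can also be checked without a render window. The sketch below wraps the conversion in a small helper and contrasts it with the linear formula from the comments, which only holds for parallel (orthographic) projection. The function name zbuffer_to_depth, its parallel_projection flag and the clipping range used are assumptions made for illustration; none of them come from VTK or from the question.

```python
import numpy as np


def zbuffer_to_depth(z_buffer, z_near, z_far, parallel_projection=False):
    """Map z-buffer values in [0, 1] to depth along the viewing direction.

    Hypothetical helper (not part of VTK). Assumes the standard OpenGL
    depth convention: 0 at the near plane, 1 at the far plane.
    """
    z = np.asarray(z_buffer, dtype=np.float64)
    if parallel_projection:
        # Orthographic case: depth is linear in the z-buffer value.
        return z_near + z * (z_far - z_near)
    # Perspective case: same non-linear mapping as in the example above.
    numerator = 2.0 * z_near * z_far
    denominator = z_far + z_near - (2.0 * z - 1.0) * (z_far - z_near)
    return numerator / denominator


# Illustrative clipping range, not taken from the question.
z_near, z_far = 1.0, 100.0
samples = np.array([0.0, 0.5, 1.0])

print(zbuffer_to_depth(samples, z_near, z_far))
print(zbuffer_to_depth(samples, z_near, z_far, parallel_projection=True))
```

With this clipping range, a z-buffer value of 0.5 maps to a depth of about 1.98 under perspective projection but to 50.5 under parallel projection. This is why z-buffer values for distant geometry bunch up close to 1. In a real script, the flag could presumably be taken from active_vtk_camera.GetParallelProjection().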

Sources:

http://web.archive.org
open3d

Very nice, I would also suggest using the vtkWindowToImageFilter to get the ZBuffer instead of getting it directly from the render window — Juan, Feb 19 '21 at 00:01
