
Given that I have:
* 3D object mesh (*.obj file)
* camera intrinsics
* camera/object pose

What I need is some way (in Python) to get, for each pixel, the corresponding 3D coordinates (if available) of the object model — similar to a depth map (https://en.wikipedia.org/wiki/Depth_map).

One could do this by intersecting every triangle in the mesh with the ray that each pixel produces through the projection matrices, and then keeping the intersection closest to the camera. But there must be a faster and more elegant way. I've been looking at ray casting and the PyOpenGL framework but I am still not sure how to do it. Can anyone help?
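For reference, the brute-force approach described above can be sketched in plain NumPy using the Möller–Trumbore ray/triangle test. This is only an illustrative sketch (the function names are mine, not from any library), and it assumes the camera sits at the origin of the camera frame with pinhole intrinsics `K`:

```python
import numpy as np

def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Möller–Trumbore ray/triangle intersection.
    Returns the distance t along the ray, or None if there is no hit."""
    edge1, edge2 = v1 - v0, v2 - v0
    h = np.cross(direction, edge2)
    a = np.dot(edge1, h)
    if abs(a) < eps:              # ray is parallel to the triangle plane
        return None
    f = 1.0 / a
    s = origin - v0
    u = f * np.dot(s, h)
    if u < 0.0 or u > 1.0:        # hit point outside the triangle
        return None
    q = np.cross(s, edge1)
    v = f * np.dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = f * np.dot(edge2, q)
    return t if t > eps else None

def pixel_to_point(px, py, K, triangles):
    """Cast the ray through pixel (px, py) and return the closest hit point
    in camera coordinates, or None. `triangles` is an (N, 3, 3) array of
    vertices already transformed into the camera frame."""
    direction = np.linalg.inv(K) @ np.array([px, py, 1.0])
    direction /= np.linalg.norm(direction)
    origin = np.zeros(3)                      # camera centre at the origin
    best_t = np.inf
    for v0, v1, v2 in triangles:
        t = ray_triangle(origin, direction, v0, v1, v2)
        if t is not None and t < best_t:      # keep the hit nearest the camera
            best_t = t
    return origin + best_t * direction if np.isfinite(best_t) else None
```

Looping over all triangles per pixel is O(pixels × triangles), which is exactly why the acceleration structures mentioned in the comments (octree / k-d tree / BVH) or a GPU depth render are preferable for whole images.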

  • [Octree](https://en.wikipedia.org/wiki/Octree) / [k-D tree](https://en.wikipedia.org/wiki/K-d_tree) / [BVH](https://en.wikipedia.org/wiki/Bounding_volume_hierarchy). Or even a simple grid. There are countless ray-tracer demos and tutorials out there. – meowgoesthedog Jun 08 '18 at 12:05
  • If you need an entire image, just render the model to a depth map. The GPU is specialized for this kind of task. – Nico Schertler Jun 08 '18 at 13:39
  • Wrote [an answer](https://stackoverflow.com/a/48243123) for another question on how to get the depth buffer in PyOpenGL that might be helpful. As for loading a 3D model, you could use a library like Assimp, which has Python bindings, to automagically read obj files. – CodeSurgeon Jun 09 '18 at 19:50

1 Answer


For future reference: I was able to do what I intended by starting with the Python package meshrender (https://pypi.org/project/meshrender/), which uses PyOpenGL. I then modified its depth renderer so that, from the normalized depth, I could obtain the 3D points.