
I am trying to transform screen-space coordinates (2D) to world space (3D) for point-cloud generation in Python. Given to me are a projection matrix, a view matrix, and a depth image. I am trying to follow the steps in "Getting World Position from Depth Buffer Value".
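For a single pixel, those steps boil down to: map (x, y, depth) into normalized device coordinates in [-1, 1], multiply by the inverse of the combined view-projection matrix, and divide by w. A minimal sketch of that pipeline (the identity matrices below are stand-ins for real camera matrices, which this post does not include):

```python
import numpy as np

def pixel_to_world(x, y, d, width, height, inv_view_proj):
    """Map pixel (x, y) with normalized depth d in [0, 1] to world space."""
    # Screen space -> normalized device coordinates in [-1, 1]
    ndc = np.array([(x / width) * 2 - 1,
                    (y / height) * 2 - 1,
                    d * 2 - 1,
                    1.0])
    # Undo projection and view in one step
    world = inv_view_proj @ ndc
    # Perspective division: homogeneous -> Cartesian
    return world[:3] / world[3]

# Identity stand-ins for the real projection and view matrices
inv_view_proj = np.linalg.inv(np.eye(4) @ np.eye(4))
print(pixel_to_world(320, 240, 0.5, 640, 480, inv_view_proj))
```

With identity matrices the center pixel at mid depth maps to the origin, which is a quick sanity check that the NDC mapping and perspective division are wired up correctly.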

So far, I have come up with this code:

import random
import numpy as np

origin = camera[:-1]

clipSpaceLocation = []
m_points = []

# Matrix multiplication of projection and view, then the inverse of the product
IViewProj = np.linalg.inv(proj @ view)

for y in range(height):
    for x in range(width):
        # 4x1; depth image with grayscale values from 0-255
        clipSpaceLocation = np.array([(x / width) * 2 - 1,
                                      (y / height) * 2 - 1,
                                      depth[y, x] * 2 - 1,
                                      1])

        # 4x4 @ 4x1 -> 4x1
        worldSpaceLocation = IViewProj @ clipSpaceLocation
        # perspective division
        worldSpaceLocation /= worldSpaceLocation[-1]
        worldSpaceV3 = worldSpaceLocation[:-1]
        m_points.append(worldSpaceV3)

m_points = np.array(m_points)

m_points holds the [x, y, z] positions, which I eventually plot as a point cloud, but it isn't giving the correct result: it's basically giving me a point cloud of the depth image. Can anyone help me with this?

1 Answer

I have figured out the solution. If anyone is looking for the answer in Python, this is the solution:

@staticmethod
def read_pc_from_layers(file_cam, file_depth, file_colour):
    origin, projection, view = read_cam_file(file_cam)
    depth = imageio.imread(file_depth)
    colour = imageio.imread(file_colour)
    i_view_projection = np.linalg.inv(view @ projection)
    width = depth.shape[1]
    height = depth.shape[0]
    vertices = []
    colours = []
    point_cloud = PointCloud()
    for y in range(height):
        for x in range(width):
            # map grayscale depth to [0, 1]
            d = depth[height - y - 1][x][0] / 255.0
            # check for valid values
            if 0.00001 < d < 0.99999999999:
                clip_space_location = np.array([(x / width) * 2 - 1, (y / height) * 2 - 1, d * 2 - 1, 1])
                world_space_location = clip_space_location @ i_view_projection
                world_space_location /= world_space_location[3]
                colours.append(colour[height - y - 1][x])
                vertices.append(world_space_location[0:3])
    point_cloud.vertices = np.asarray(vertices)
    point_cloud.colours_luminance = np.asarray(colours).astype(np.uint8)
    point_cloud.colours_labels = np.asarray(colours).astype(np.uint8)
    return point_cloud
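One design note: the per-pixel Python loop is slow for large images, and the same transform can be done for every pixel at once with NumPy. A hedged sketch of a vectorized version (the depth array and the identity `i_view_projection` below are synthetic stand-ins, since the real camera file readers are not shown; it keeps the answer's row-vector convention `clip @ i_view_projection`):

```python
import numpy as np

# Synthetic stand-ins for the real inputs
height, width = 4, 4
depth = np.full((height, width, 1), 128, dtype=np.uint8)
i_view_projection = np.eye(4)

# Flip rows to match depth[height - y - 1][x], map grayscale to [0, 1]
d = depth[::-1, :, 0].astype(float) / 255.0

# Build clip-space coordinates for every pixel in one shot
ys, xs = np.mgrid[0:height, 0:width]
clip = np.stack([(xs / width) * 2 - 1,
                 (ys / height) * 2 - 1,
                 d * 2 - 1,
                 np.ones_like(d)], axis=-1).reshape(-1, 4)

# Keep only pixels with usable depth (same bounds as the loop version)
valid = ((d > 0.00001) & (d < 0.99999999999)).ravel()

# One matrix multiply for all valid pixels, then perspective division
world = clip[valid] @ i_view_projection
vertices = world[:, :3] / world[:, 3:4]
print(vertices.shape)
```

The result matches what the nested loops accumulate into `vertices`, just computed as a single matrix product instead of one 4-vector at a time.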