I am trying to transform screen-space coordinates (2D) to world-space coordinates (3D) for point cloud generation in Python. I am given a projection matrix, a view matrix, and a depth image, and I am following the steps described in "Getting World Position from Depth Buffer Value."
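Mathematically, the unprojection I am trying to apply is the following (with $P$ the projection matrix, $V$ the view matrix, $W \times H$ the image size, and $d \in [0, 1]$ the normalized depth at pixel $(x, y)$):

$$\begin{pmatrix} x_w \\ y_w \\ z_w \\ w \end{pmatrix} = (P\,V)^{-1} \begin{pmatrix} 2x/W - 1 \\ 2y/H - 1 \\ 2d - 1 \\ 1 \end{pmatrix}, \qquad p_{\mathrm{world}} = \frac{1}{w} \begin{pmatrix} x_w \\ y_w \\ z_w \end{pmatrix}$$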
So far, I have come up with this code:
import numpy as np

origin = camera[:-1]  # camera position (x, y, z); camera is provided elsewhere and unused below
m_points = []
# Inverse of projection @ view maps clip space back to world space
IViewProj = np.linalg.inv(proj @ view)
for y in range(height):
    for x in range(width):
        # Build the 4x1 clip-space (NDC) position for this pixel.
        # The depth image holds grayscale values in 0-255, so it is
        # normalized to [0, 1] before mapping to the [-1, 1] NDC range.
        clipSpaceLocation = np.array([(x / width) * 2 - 1,
                                      (y / height) * 2 - 1,
                                      (depth[y, x] / 255.0) * 2 - 1,
                                      1])
        # 4x4 @ 4x1 -> 4x1
        worldSpaceLocation = IViewProj @ clipSpaceLocation
        # Perspective division by the w component
        worldSpaceLocation /= worldSpaceLocation[-1]
        worldSpaceV3 = worldSpaceLocation[:-1]
        m_points.append(worldSpaceV3)
m_points = np.array(m_points)
m_points holds the resulting [x, y, z] positions, which I then plot as a point cloud, but the result is not correct: it essentially reproduces the depth image as a point cloud rather than the reconstructed 3D scene. Can anyone help me with this?
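For reference, here is a minimal self-contained version of the same unprojection with placeholder inputs, so the computation can be run as-is. The make_projection helper, the near/far values, the identity view matrix, and the constant depth image are all stand-ins for illustration; my real proj, view, and depth come from the renderer:

import numpy as np

def make_projection(fov_y_deg, aspect, near, far):
    # Standard OpenGL-style perspective projection (illustrative stand-in)
    f = 1.0 / np.tan(np.radians(fov_y_deg) / 2)
    return np.array([
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ])

height, width = 4, 4                       # tiny image just for the example
proj = make_projection(60, width / height, 0.1, 100.0)
view = np.eye(4)                           # camera at origin, looking down -z
depth = np.full((height, width), 128, dtype=np.uint8)  # fake 0-255 depth

IViewProj = np.linalg.inv(proj @ view)

# Vectorized unprojection of every pixel at once
xs, ys = np.meshgrid(np.arange(width), np.arange(height))
ndc = np.stack([
    (xs / width) * 2 - 1,
    (ys / height) * 2 - 1,
    (depth / 255.0) * 2 - 1,
    np.ones((height, width)),
], axis=-1).reshape(-1, 4)                 # (H*W, 4) clip-space positions

world = ndc @ IViewProj.T                  # same as IViewProj @ p, per point
world /= world[:, 3:4]                     # perspective division by w
m_points = world[:, :3]
print(m_points.shape)                      # (16, 3)

This vectorized form computes the same thing as the nested loop above, just for all pixels at once.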