I just started to study about Multiview stereo vision.
But I cannot understand disparity and depth (disparity map and depth map either).
Could you give me the intuition?
Thanks.
In stereo vision, two images captured by two cameras separated by a known distance can be used to recover the 3D location (x, y, z) of scene points in the real world, i.e. the depth (the z location) in addition to the 2D (x, y) location.
The disparity is the difference in image location of the same 3D point when projected under perspective to two different cameras.
Any point in the scene that is visible in both cameras will be projected to a pair of image points in the two images, called a conjugate pair. The displacement between the positions of the two points is called the disparity.
Read more here.
The disparity map/image is simply the image in which each pixel stores the disparity of the corresponding 3D point.
The depth (the actual z location of the 3D point) can be calculated from the disparity of the corresponding point, e.g. in the simple case of two parallel (rectified) cameras, as follows:
depth = (baseline * focal length) / disparity
where baseline is the distance between the two cameras. With the focal length and disparity both measured in pixels, the depth comes out in the same unit as the baseline.
By getting the depth of every pixel, you get the depth map/image.
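As a rough illustration of the formula above, here is a minimal pure-Python sketch that turns a small disparity map into a depth map. The function names, the focal length of 800 px, the baseline of 0.25 m, and the sample disparities are all made-up values for illustration, and treating a non-positive disparity as "no depth" is one possible convention, not the only one.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth of one pixel from depth = (baseline * focal length) / disparity.

    Returns None when the disparity is not positive (no valid match,
    or a point effectively at infinity).
    """
    if disparity_px <= 0:
        return None
    return baseline_m * focal_length_px / disparity_px


def depth_map_from_disparity_map(disparity_map, focal_length_px, baseline_m):
    """Apply the formula to every pixel of a disparity map (list of rows)."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]


# Hypothetical camera parameters and a tiny 2x2 disparity map (in pixels).
FOCAL_LENGTH_PX = 800.0
BASELINE_M = 0.25
disparity_map = [
    [40.0, 20.0],
    [0.0, 80.0],
]

depth_map = depth_map_from_disparity_map(disparity_map, FOCAL_LENGTH_PX, BASELINE_M)
for row in depth_map:
    print(row)
```

Note the inverse relationship: halving the disparity doubles the depth, so nearby points have large disparities and far-away points have small ones.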
Read more here.
Disparity is the horizontal displacement of a point's projections between the left and the right image, whereas depth refers to the z coordinate of a point located in the real 3D world (x, y, z).
It is also important to note that, given the disparity, one can calculate the corresponding depth from the parameters of the stereo setup that took the images: the baseline b between the cameras and the focal length f.
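The relation between disparity and depth can be sketched with similar triangles. The derivation below assumes two identical, parallel (rectified) pinhole cameras, with the left camera at the origin and the right camera shifted by b along the x axis. The symbols X, Z, x_l, x_r and d are introduced here for illustration; only b and f come from the text above.

```latex
% A 3D point (X, Y, Z) projects to horizontal image coordinates
% x_l in the left image and x_r in the right image:
\begin{align}
  x_l &= \frac{f\,X}{Z}, &
  x_r &= \frac{f\,(X - b)}{Z}
\end{align}
% The disparity d is the difference of the two projections:
\begin{align}
  d &= x_l - x_r = \frac{f\,b}{Z}
  \quad\Longrightarrow\quad
  Z = \frac{f\,b}{d}
\end{align}
```

The unknown X cancels out, which is why the depth Z depends only on f, b and the measured disparity d.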
I attached some of my notes below; hopefully they shed some light on the process. My notes on depth calculation from disparity