
I am trying to do 3D reconstruction using SfM (Structure from Motion). I am pretty new to computer vision and doing this as a hobby, so if you use acronyms, please also let me know what they stand for so I can look them up.

Learning-wise, I have been following these resources:

  1. https://www.youtube.com/watch?v=SyB7Wg1e62A&list=PLgnQpQtFTOGRYjqjdZxTEQPZuFHQa7O7Y&ab_channel=CyrillStachniss
  2. https://imkaywu.github.io/tutorials/sfm/#triangulation
  3. Plus the links below in the "Quick Question" section.

My end goal is to use this on a person's face to create a 3D face reconstruction. If people have advice on this topic specifically, please let me know as well.

I do the following steps (a condensed sketch follows the list):

  1. I/O using OpenCV: the input is a video taken with a single camera.
  2. Find the intrinsic parameters and distortion coefficients of the camera using Zhang's method.
  3. Use SIFT (Scale-Invariant Feature Transform) to find features in frame 1 and frame 2.
  4. Feature matching is done using cv2.FlannBasedMatcher() (FLANN: Fast Library for Approximate Nearest Neighbors).
  5. Compute the essential matrix using cv2.findEssentialMat().
  6. The projection matrix of frame 1 is set to numpy.hstack((numpy.eye(3), numpy.zeros((3, 1)))).
  7. Rotation and translation are obtained using cv2.recoverPose().
  8. Using the rotation and translation we get the projection matrix of frame 2: curr_proj_matrix = cv2.hconcat([curr_rotation_matrix, curr_translation_matrix]).
  9. I use cv2.undistortPoints() on the feature points for frames 1 and 2, using the information from step 2.
  10. Lastly, I do triangulation: points_4d = triangulation.triangulate(prev_projection_matrix, curr_proj_matrix, prev_pts_u, curr_pts_u).
  11. Then I reassign the prev values to equal the curr values and continue through the video.
  12. I use matplotlib to display the scatter plot.
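
To make the list concrete, here is a condensed sketch of how steps 5-10 fit together for a single frame pair. This is not my exact code; the function and variable names are illustrative, and it assumes K and dist come from step 2 and that prev_pts/curr_pts are Nx2 float arrays of matched feature locations from step 4:

```
import cv2
import numpy as np

def two_view_points(prev_pts, curr_pts, K, dist):
    # Step 5: essential matrix from matched points (RANSAC rejects outliers).
    E, mask = cv2.findEssentialMat(prev_pts, curr_pts, K,
                                   method=cv2.RANSAC, prob=0.999, threshold=1.0)
    # Step 6: frame 1 projection matrix is [I | 0] (world frame = camera 1).
    prev_projection_matrix = np.hstack((np.eye(3), np.zeros((3, 1))))
    # Steps 7-8: relative pose of frame 2, then its projection matrix [R | t].
    _, R, t, mask = cv2.recoverPose(E, prev_pts, curr_pts, K, mask=mask)
    curr_proj_matrix = cv2.hconcat([R, t])
    # Step 9: undistort to *normalized* coordinates (no P argument passed),
    # which is why the projection matrices above do not include K.
    prev_pts_u = cv2.undistortPoints(prev_pts.reshape(-1, 1, 2), K, dist)
    curr_pts_u = cv2.undistortPoints(curr_pts.reshape(-1, 1, 2), K, dist)
    # Step 10: triangulate; the result is a 4xN array of homogeneous points.
    points_4d = cv2.triangulatePoints(prev_projection_matrix, curr_proj_matrix,
                                      prev_pts_u.reshape(-1, 2).T.astype(np.float64),
                                      curr_pts_u.reshape(-1, 2).T.astype(np.float64))
    return points_4d
```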

Quick Question:

  1. Why do some articles write E = (K^-1)^T * F * K and some E = K^T * F * K?

First way: What do I do with the fundamental matrix?

Second way: https://harish-vnkt.github.io/blog/sfm/
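
(As the comments below clarify, the prime in K' in the literature denotes the second camera's matrix, not an inverse; with a single camera, K' = K.) A minimal synthetic check of the relation E = K'^T * F * K, with all numbers made up for illustration:

```
import cv2
import numpy as np

# Made-up intrinsics and pose, only to verify the algebra numerically.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, _ = cv2.Rodrigues(np.array([0.1, 0.2, 0.05]))  # arbitrary rotation
t = np.array([1.0, 0.2, 0.1])                     # arbitrary translation
tx = np.array([[  0.0, -t[2],  t[1]],
               [ t[2],   0.0, -t[0]],
               [-t[1],  t[0],   0.0]])            # skew-symmetric [t]_x
E = tx @ R                                        # definition of E
F = np.linalg.inv(K).T @ E @ np.linalg.inv(K)     # F from E and K
print(np.allclose(K.T @ F @ K, E))                # True: K^T F K recovers E
```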

Issue:

As you can see, the scatter plot looks a bit warped. I am unsure why, whether I am missing a step, or whether I am doing something wrong, hence I am looking for advice. Also, the Z axis values are all negative.

One of my guesses was that the video is 60 FPS (frames per second), and even though I am moving the camera relatively quickly, there might not be enough rotation + translation between consecutive frames for a reliable triangulation. However, removing frames in between did not make much difference.
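
For reference, the frame-removal test was just sampling every Nth frame, along these lines (minimal sketch; the file name and the step of 10 are illustrative):

```
import cv2

cap = cv2.VideoCapture("face_video.mp4")  # illustrative file name
frame_step = 10   # at 60 FPS, one frame every 1/6 s -> larger baseline
frames = []
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if idx % frame_step == 0:   # keep every Nth frame only
        frames.append(frame)
    idx += 1
cap.release()
```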

[Image: scatter plot of 3D points] [Image: input image + feature points]

Please let me know if you would like me to provide some of the code.

  • Are you working with undistorted images or undistorted feature positions? – Micka Oct 03 '21 at 21:41
  • It's currently set up to use undistorted feature positions. Does one have an advantage over the other? – Oct 03 '21 at 21:48
  • Question 1: both versions look wrong (typos?). The prime symbol is part of the identifier; it is not an inverse, derivative, or transpose. K and K' are the (possibly not identical) camera matrices of the two views. Go with Wikipedia, the original papers, and proper books: https://en.wikipedia.org/wiki/Fundamental_matrix_(computer_vision) – Christoph Rackwitz Oct 03 '21 at 22:13
  • matplotlib will *NOT* scale the plot axes equally. It will look weird because autoscaling stretches everything so the whole box is used. Perhaps look into `plot.ly`. I would have tagged `matplotlib`, but only 5 tags are allowed and I can't choose. – Christoph Rackwitz Oct 03 '21 at 22:17
  • I think I figured out Question 1, thank you for the wiki page. I was under the impression that the symbol K' was the inverse matrix of K, but I think it means the K of the second camera. Hence, essential matrix = (transpose of K of the second camera/frame) * (fundamental matrix) * (K of the first camera). In my case they are the same. – Oct 03 '21 at 22:20
  • Thank you, removed one tag and replaced it with matplotlib. –  Oct 03 '21 at 22:32
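
Following up on the matplotlib comment above, here is a minimal sketch of forcing equal axis scaling on the 3D scatter plot (set_box_aspect requires Matplotlib >= 3.3; points_3d is an illustrative Nx3 array of triangulated points):

```
import matplotlib.pyplot as plt
import numpy as np

points_3d = np.random.rand(100, 3)  # placeholder for triangulated points
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(points_3d[:, 0], points_3d[:, 1], points_3d[:, 2], s=2)
# Match the box proportions to the data ranges so one unit has the same
# length on every axis, instead of matplotlib's default autoscaling.
ax.set_box_aspect(points_3d.max(axis=0) - points_3d.min(axis=0))
plt.show()
```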

1 Answer


I believe I have an answer, but I am not sure why it works. Hence, if someone could expand on it, plus mention what the 4th coordinate (points_4d[3]) of the 4D points is, then I will accept that answer and delete this one.

Do this on the 4D points after triangulation: points_4d /= points_4d[3] (1)

The documentation does not mention this step: https://docs.opencv.org/4.5.3/d9/d0c/group__calib3d.html#gad3fc9a0c82b08df034234979960b778c

My best guess is that doing (1) is equivalent to calling cv2.convertPointsFromHomogeneous(), i.e. converting from homogeneous coordinates to Euclidean coordinates.
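
A minimal check of that equivalence with made-up homogeneous points (cv2.convertPointsFromHomogeneous() expects Nx4 input, while triangulation returns 4xN, hence the transposes):

```
import cv2
import numpy as np

points_4d = np.array([[2.0, 4.0, 6.0, 2.0],
                      [4.0, 8.0, 2.0, 2.0]])  # two made-up homogeneous points (Nx4)

# Step (1) operates on the 4xN layout that triangulation returns:
h = points_4d.T                     # 4xN, as triangulation outputs it
manual = (h / h[3])[:3].T           # divide by w, drop w, back to Nx3

via_cv = cv2.convertPointsFromHomogeneous(points_4d).reshape(-1, 3)
print(np.allclose(manual, via_cv))  # True: both divide by the 4th coordinate
```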


Edit 2021-10-03: please see the comment below for further explanation.

  • Good guess, yes that's it. They're just 3D points in a 4D projective space, analogous to 2D points in a 3D projective space. All points (x,y,z,1) * w, for arbitrary nonzero w, in the projective space represent the same 3D point (x,y,z), and (x,y,z,1) is the canonical representative. The additional dimension makes translations possible (among other things). I don't know why those points would be off the 4D plane; I don't understand the SfM algorithms that well. In the case of homographies (and 3D-to-2D projections), they're expected to be, and require that division step. – Christoph Rackwitz Oct 04 '21 at 00:17