
I am trying to reconstruct a 3D model from 2 images taken with the same camera, using OpenCV with C++. I followed this method, but I am still not able to find the mistake in my R and T computation.

Image 1 (background removed to eliminate mismatches):

img

Image 2 (translated only in the X direction w.r.t. Image 1, background removed to eliminate mismatches):

img

I found the intrinsic camera matrix (K) using the MATLAB calibration toolbox:

K = [3058.8      0    -500
          0 3057.3     488
          0      0       1]

All matched keypoints (SIFT with brute-force matching, mismatches eliminated) were aligned w.r.t. the image center as follows:

obj_points.push_back(Point2f(
    keypoints1[symMatches[i].queryIdx].pt.x - image1.cols / 2,
    -1 * (keypoints1[symMatches[i].queryIdx].pt.y - image1.rows / 2)));
scene_points.push_back(Point2f(
    keypoints2[symMatches[i].trainIdx].pt.x - image1.cols / 2,
    -1 * (keypoints2[symMatches[i].trainIdx].pt.y - image1.rows / 2)));

From the point correspondences, I estimated the fundamental matrix using RANSAC in OpenCV.

Fundamental matrix:

F = [0           0        -0.0014
     0           0         0.0028
     0.00149    -0.00572   1     ]
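
For reference, this is roughly how that estimation step looks (a sketch; the 3-pixel RANSAC threshold and 0.99 confidence are just the OpenCV defaults, and mask marks the inlier matches):

#include <opencv2/calib3d.hpp>

// Estimate the fundamental matrix with RANSAC; 'mask' marks inlier matches.
std::vector<uchar> mask;
cv::Mat f = cv::findFundamentalMat(obj_points, scene_points,
                                   cv::FM_RANSAC, 3.0, 0.99, mask);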

Essential matrix obtained via E = Kᵀ·F·K:

E = (camera_Intrinsic.t())*f*camera_Intrinsic;

E obtained:

E = [  0.0094    36.290     1.507
     -37.2245    -0.6073   14.71
      -1.3578   -23.545    -0.442]

SVD of E:

E.convertTo(E, CV_32F);

// W is used below to build the two rotation hypotheses (U*W*Vt and U*W.t()*Vt);
// Z would give the skew-symmetric translation as U*Z*U.t() (defined but unused here).
Mat W = (Mat_<float>(3, 3) << 0, -1, 0, 1, 0, 0, 0, 0, 1);
Mat Z = (Mat_<float>(3, 3) << 0, 1, 0, -1, 0, 0, 0, 0, 0);

SVD decomp = SVD(E);
Mat U = decomp.u;       // left singular vectors
Mat Lambda = decomp.w;  // singular values
Mat Vt = decomp.vt;     // right singular vectors, transposed

New essential matrix, projected so that it satisfies the epipolar constraint (singular values forced to (1, 1, 0)):

Mat diag = (Mat_<float>(3, 3) << 1, 0, 0, 0, 1, 0, 0, 0, 0);
Mat new_E = U*diag*Vt;

SVD new_decomp = SVD(new_E);

Mat new_U = new_decomp.u;
Mat new_Lambda = new_decomp.w;
Mat new_Vt = new_decomp.vt;

Rotation from SVD:

Mat R1 = new_U*W*new_Vt;
Mat R2 = new_U*W.t()*new_Vt;
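
One thing I have not done above is check that each candidate is a proper rotation with det(R) = +1; as I understand it, if the SVD yields a reflection the usual fix is to negate the matrix (sketch):

// If det(R) is -1 the decomposition produced a reflection; negating
// the whole matrix restores a proper rotation with det(R) = +1.
if (cv::determinant(R1) < 0) R1 = -R1;
if (cv::determinant(R2) < 0) R2 = -R2;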

Translation from SVD (third column of new_U, defined only up to scale):

Mat T1 = (Mat_<float>(3, 1) << new_U.at<float>(0, 2),
                               new_U.at<float>(1, 2),
                               new_U.at<float>(2, 2));
Mat T2 = -1 * T1;

I got the following R matrices:

R1 = [-0.58    -0.042    0.813
      -0.020   -0.9975  -0.066
       0.81    -0.054    0.578]

R2 = [ 0.98     0.0002    0.81
      -0.02    -0.99     -0.066
       0.81    -0.054     0.57 ]

Translation vectors:

T1 = [ 0.543
      -0.030
       0.838]

T2 = [-0.543
       0.03
      -0.83 ]

Please point out where I am making a mistake.

All 4 resulting P2 = [R|T] candidates, with P1 = [I|0], give incorrect triangulated models.
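
For cross-checking, I understand OpenCV (3.0 and later) provides cv::recoverPose, which resolves the four-fold ambiguity internally via a cheirality check. A minimal sketch, assuming the matched points are in their original pixel coordinates and K is the calibrated camera matrix:

// recoverPose tries all four (R, t) candidates, triangulates the
// correspondences, and returns the pose that puts the most points in
// front of both cameras. t is recovered only up to scale.
cv::Mat R, t;
int inliers = cv::recoverPose(E, obj_points, scene_points,
                              camera_Intrinsic, R, t);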

Also, I think the T vector obtained is incorrect: the motion was a pure X shift, so there should be no Z component.

When I tried with image1 = image2 (identical images), I got T = [0, 0, 1]. What does Tz = 1 mean when there is no shift at all between the two images?

And should I be aligning my keypoint coordinates with the image center, or with the principal point obtained from calibration?
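
In case it matters, the principal-point variant I have in mind would look like this (a sketch; it assumes camera_Intrinsic is stored as CV_64F):

// Hypothetical variant: shift keypoints by the principal point (cx, cy)
// from K instead of the geometric image center.
double cx = camera_Intrinsic.at<double>(0, 2);
double cy = camera_Intrinsic.at<double>(1, 2);
obj_points.push_back(Point2f(
    keypoints1[symMatches[i].queryIdx].pt.x - cx,
    -1 * (keypoints1[symMatches[i].queryIdx].pt.y - cy)));
scene_points.push_back(Point2f(
    keypoints2[symMatches[i].trainIdx].pt.x - cx,
    -1 * (keypoints2[symMatches[i].trainIdx].pt.y - cy)));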

  • Are you aware of the fact that the two images are planar, which means that the result will be a simple plane? – Humam Helfawi Oct 08 '17 at 18:54
  • @HumamHelfawi Yes, I am aware (I am in the primary testing stage: simple translation and rotation of simple shapes). The result should be planar, i.e. the depth values are expected to be the same for all points, but this wasn't the outcome of the actual triangulation. – Spyder Oct 09 '17 at 05:31
  • I think the Essential matrix needs to be estimated from points in general configuration. The translation will then be estimated up to a scale factor. You better check the literature, for instance: [Relationships between the Homography and the Essential Matrix](https://cseweb.ucsd.edu/classes/sp04/cse252b/notes/lec05/lec5.pdf) – Catree Oct 09 '17 at 13:19
