I'm trying to find the focal length, position and orientation of a camera in world space.
Because I need this to be resolution-independent, I normalized my image coordinates to be in the range [-1, 1] for x, and a somewhat smaller range for y (depending on aspect ratio). So (0, 0) is the center of the image. I've already corrected for lens distortion (using the k1 and k2 coefficients), so this does not enter the picture, except for sometimes throwing x or y slightly out of the [-1, 1] range.
As a given, I have a planar, fixed rectangle in world space of known dimensions (in millimeters). The four corners of the rectangle are guaranteed to be visible, and are manually marked in the image. For example:
std::vector<cv::Point3f> worldPoints = {
    cv::Point3f(0, 0, 0),
    cv::Point3f(2000, 0, 0),
    cv::Point3f(0, 3000, 0),
    cv::Point3f(2000, 3000, 0),
};
std::vector<cv::Point2f> imagePoints = {
    cv::Point2f(-0.958707, -0.219624),
    cv::Point2f(-1.22234, 0.577061),
    cv::Point2f(0.0837469, -0.1783),
    cv::Point2f(0.205473, 0.428184),
};
Effectively, the equation I think I'm trying to solve is (see the equivalent in the OpenCV documentation):
    / xi \   / fx  0  0 \ /        tx \ / Xi \
s * | yi | = |  0 fy  0 | |  Rxyz  ty | | Yi |
    \ 1  /   \  0  0  1 / \        tz / | Zi |
                                        \ 1  /
where:

- i is 1, 2, 3, 4
- xi, yi is the location of point i in the image (between -1 and 1)
- fx, fy are the focal lengths of the camera in the x and y directions
- Rxyz is the 3x3 rotation matrix of the camera (has only 3 degrees of freedom)
- tx, ty, tz is the translation of the camera
- Xi, Yi, Zi is the location of point i in world space (in millimeters)
So I have 8 equations (4 points of 2 coordinates each), and 8 unknowns (fx and fy, the 3 degrees of freedom of Rxyz, and tx, ty, tz). Therefore, I conclude (barring pathological cases) that a unique solution must exist.
However, I can't seem to figure out how to compute this solution using OpenCV.
I have looked at the imgproc module:
- getPerspectiveTransform works, but gives me only a 3x3 matrix (from 2D points to 2D points). If I could somehow extract the needed parameters from this matrix, that would be great.
I have also looked at the calib3d module, which contains a few promising functions that do almost, but not quite, what I need:
- initCameraMatrix2D sounds almost perfect, but when I pass it my four points like this:

      cv::Mat cameraMatrix = cv::initCameraMatrix2D(
          std::vector<std::vector<cv::Point3f>>({worldPoints}),
          std::vector<std::vector<cv::Point2f>>({imagePoints}),
          cv::Size2f(2, 2), -1);

  it returns a camera matrix that has fx, fy set to -inf, inf.

- calibrateCamera seems to use a complicated solver to deal with overdetermined systems and outliers. I tried it anyway, but all I can get from it are assertion failures like this:

      OpenCV(3.4.1) Error: Assertion failed (0 <= i && i < (int)vv.size()) in getMat_, file /build/opencv/src/opencv-3.4.1/modules/core/src/matrix_wrap.cpp, line 79
Is there a way to entice OpenCV to do what I need? And if not, how could I do it by hand?