
I am currently building an Augmented Reality application and am stuck on a problem that seems quite easy but is proving very hard for me ... The problem is as follows:

My device's camera is calibrated and detects a 2D marker (such as a QR code). I know the focal length, the sensor's position, the distance between the camera and the center of the marker, the real size of the marker, and the coordinates of the four corners of the marker and of its center on the 2D image I get from the camera. See the following image:

illustration

On the image, we know the distances a, b, c and d, and the coordinates of the red dots.

What I need to know is the position and orientation of the camera relative to the marker (as represented in the image, the origin is the center of the marker).

Is there an easy and fast way to do this? I tried a method I came up with myself (using Al-Kashi's theorem, i.e. the law of cosines), but it ended up with too much error :(. Could someone point out a way to get me out of this?

Cocottier
  • This is usually solved by the so-called "PnP" algorithm (see e.g. [this article](http://cvlabwww.epfl.ch/~lepetit/papers/lepetit_ijcv08.pdf)). What programming language / library are you using? – BConic Jan 14 '15 at 16:43
  • I knew about this solution, but it is way too complex for me; my knowledge is not good enough to fully understand it. I'm using C and drawing with OpenGL (and nothing more). I have a fully working AR app which I'm evolving to use markers (we were using a lib called Vuforia before that). The only thing left is to get that damn camera position :( Is there an easier way? – Cocottier Jan 14 '15 at 19:45
  • 1
    This is not a simple problem, I don't think there is an easier way... But maybe you can find some code or lightweight library to use. – BConic Jan 14 '15 at 19:59
  • I'll look for that (I kind of already did, but usually the functions are not well commented, so they are still hard to understand, and using something without understanding how it works is something I dislike a bit). I might find one that helps me understand the problem better! – Cocottier Jan 15 '15 at 08:33

2 Answers


You can find some example code for the EPnP algorithm on this webpage. The code consists of one header file and one source file, plus one file for the usage example, so it shouldn't be too hard to include in your project.

Note that this code is released for research/evaluation purposes only, as mentioned on this page.

EDIT:

I just realized that this code needs OpenCV to work. That said, although it would add a pretty big dependency to your project, the current version of OpenCV has a built-in function called solvePnP, which does what you want.
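As a sketch of what to do with solvePnP's output: it returns a rotation vector and a translation that map marker coordinates into camera coordinates, so the camera's pose relative to the marker is the inverse of that transform. The numpy snippet below writes out the Rodrigues conversion by hand; the function names are mine for illustration, not OpenCV's:

```python
import numpy as np

def rodrigues(rvec):
    """Convert an axis-angle rotation vector to a 3x3 rotation matrix
    (same convention as OpenCV's cv::Rodrigues)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = np.asarray(rvec, dtype=float) / theta   # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])          # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def camera_pose_from_pnp(rvec, tvec):
    """solvePnP's (rvec, tvec) satisfy X_cam = R @ X_marker + t.
    Inverting that transform gives the camera centre and orientation
    expressed in the marker's coordinate system."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3)
    cam_position = -R.T @ t   # camera centre in marker coordinates
    cam_orientation = R.T     # camera axes expressed in the marker frame
    return cam_position, cam_orientation
```

With the real solvePnP you would feed in the four marker corners (2D image points and their 3D marker-frame coordinates) plus the camera matrix, then apply the inversion above to get the pose you can hand to OpenGL.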

BConic
  • I'll look at both of them and keep you updated, thank you for your help! – Cocottier Jan 15 '15 at 08:38
  • So I was looking at the solvePnP function from OpenCV, but none of the methods called in the function (PnP(), P3Psolver(), ...) are accessible, nor referenced in a header (so no external library). How is that possible? O_o – Cocottier Jan 15 '15 at 10:40
  • These are internal to the library. You should be using the `solvePnP` function directly. [There](http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#solvepnp) is some documentation on this function. FYI, PnP and P3PSolver are functor instances of types epnp and p3p (see lines 64 & 76 of "solvepnp.cpp"), which are defined in p3p.h/cpp and epnp.h/cpp. – BConic Jan 15 '15 at 10:44
  • Okay, so I'm exploring a bit more how to solve my problem thanks to you and OpenCV, but I'm stuck by this line: https://github.com/Itseez/opencv/blob/5f590ebed084a5002c9013e11c519dcb139d47e9/modules/core/src/lapack.cpp#L1764 What is its effect? I though it called the constructor, which I don't understand either: https://github.com/Itseez/opencv/blob/02f4f2f96db1ee6bff5e35af240ab9eaf42411b0/modules/core/include/opencv2/core/operations.hpp#L288 Any idea? – Cocottier Jan 15 '15 at 13:19
  • Again, this is a functor: `svd` is an instance of the class `cv::SVD`, which overloads the `operator()`. Hence, doing `svd(...)` is like a function call. See [this post](http://stackoverflow.com/questions/356950/c-functors-and-their-uses) for more details on functors. – BConic Jan 15 '15 at 14:43
  • I got it, but what does this function do? I mean, in the example you gave me, we create a new instance of the functor (that's what we do with `cv::SVD`), and then we use that functor (`svd(...)`), but here the function behind the functor doesn't appear: `operator ()(m, flags);`. Does it do nothing at all? The body of the functor is replaced by a `;` ... I easily understand the behavior of the one in the example, `int operator()(int y) { return x + y; }`, but going by that example, the SVD one seems to do nothing. I must be wrong, because there would be no point in calling nothing ... – Cocottier Jan 15 '15 at 15:04
  • As `cv::SVD` is a class, the body of the `operator()` can be defined elsewhere, like any other member function. You usually declare member functions in header files and define their body in cpp files, the same is true for operators. In this particular case, `cv::SVD::operator()` is defined at line 1614 of `lapack.cpp`. This computes the SVD (i.e. [Singular Value Decomposition](http://en.wikipedia.org/wiki/Singular_value_decomposition)) of matrix `a`, which is then used in the remainder of the function, via `svd.u` `svd.vt` and `svd.w`. – BConic Jan 15 '15 at 15:34
  • Since I can't understand how those things work, I will just use OpenCV to solve my problem. Thanks a lot for your help. I might look at it again someday, but for now that suits me. Thanks again for teaching me what functors are :) – Cocottier Jan 19 '15 at 15:30
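To make the functor idea from the comments above concrete in a compact, runnable form, here is a Python analogue: `__call__` plays the role of C++'s `operator()`, and, like `cv::SVD`, the object stores the factors `u`, `w`, `vt` after being "called". This is a toy sketch, not OpenCV code:

```python
import numpy as np

class SVD:
    """Toy analogue of cv::SVD: the instance is callable, and calling it
    decomposes a matrix and stores the factors on the object, so they
    can later be read back as svd.u, svd.w and svd.vt."""

    def __call__(self, a):
        # np.linalg.svd returns U, singular values, and V transposed
        self.u, self.w, self.vt = np.linalg.svd(np.asarray(a, dtype=float))
        return self

svd = SVD()                                   # create the functor instance
A = np.array([[3.0, 0.0], [0.0, 1.0]])
svd(A)                                        # "function call" on an object
recon = svd.u @ np.diag(svd.w) @ svd.vt       # reconstructs A
```

In the OpenCV source the same pattern appears: the `operator()` body lives in a cpp file rather than next to the declaration, which is why it looks like the call "does nothing" when you only read the header.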

You can compute the homography between the image points and the corresponding world points. Then from the homography you can compute the rotation and translation mapping a point from the marker's coordinate system into the camera's coordinate system. The math is described in the paper on camera calibration by Zhang.
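For reference, the decomposition described in Zhang's paper can be sketched in a few lines of numpy. Assuming the homography H maps marker-plane points (X, Y, 1) to homogeneous image points and K is the intrinsic matrix, H is proportional to K [r1 r2 t]; the function name below is hypothetical:

```python
import numpy as np

def pose_from_homography(K, H):
    """Recover rotation R and translation t from a plane-to-image
    homography H, given the intrinsic matrix K (Zhang's method:
    H ~ K [r1 r2 t], where r1, r2 are the first two columns of R)."""
    B = np.linalg.inv(K) @ H
    lam = 1.0 / np.linalg.norm(B[:, 0])  # scale fixed by ||r1|| = 1
    r1 = lam * B[:, 0]
    r2 = lam * B[:, 1]
    r3 = np.cross(r1, r2)                # third column completes the rotation
    t = lam * B[:, 2]
    R = np.column_stack([r1, r2, r3])
    # With noisy data R is not exactly orthonormal; project onto the
    # nearest rotation matrix via SVD, as Zhang suggests.
    u, _, vt = np.linalg.svd(R)
    return u @ vt, t
```

With noisy measurements you would also check the sign of the scale (t should place the marker in front of the camera), but the skeleton above is the core of the math on page 6 of the paper.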

Here's an example in MATLAB using the Computer Vision System Toolbox, which does most of what you need. It uses the extrinsics function, which computes a 3D rotation and a translation from matching image and world points. The points need not come from a checkerboard.

Dima
  • Hello, thanks for pointing this method out. Since I have far fewer points than a chessboard, do you think I could still use this method? If so, is it possible to find the source of the "extrinsics" function? – Cocottier Jan 15 '15 at 15:30
  • 1
    @user2342594, You need at least 4 non-colinear points to compute the homography, which is a 2d projective transformation. The math for computing R and t from known intrinsic matrix and a homography is given on page 6 of the paper. It is rather simple to code. If you have a recent version of MATLAB with the Computer Vision System Toolbox, you can simply look at the source for extrinsics(). – Dima Jan 15 '15 at 15:54