finding the real world coordinates of an image point

Question

I have been searching many resources on the internet for days, but I could not solve this problem.

I have a project in which I am supposed to detect the position of a circular object on a plane. Since the object lies on a plane, all I need is its x and y position (not z). For this purpose I have chosen to go with image processing. The position and orientation of the camera (single view, not stereo) are fixed with respect to a reference coordinate system on the plane, and they are known.

I have detected the pixel coordinates of the circle centers using OpenCV. All I need now is to convert these coordinates to real-world coordinates.

On http://www.packtpub.com/article/opencv-estimating-projective-relations-images, and on other sites as well, the projective transformation is given as:

p = C[R|T]P, where P is the real-world coordinates and p is the pixel coordinates (in homogeneous coordinates). C is the camera matrix representing the intrinsic parameters, R is the rotation matrix and T is the translation vector. I followed a tutorial on calibrating the camera with OpenCV (using the cameraCalibration source file) with 9 good chessboard images, and as output I have the intrinsic camera matrix and the translation and rotation parameters for each image.

I have the 3x3 intrinsic camera matrix (focal lengths and principal point) and a 3x4 extrinsic matrix [R|T], in which R is the left 3x3 block and T is the right 3x1 column. According to the formula p = C[R|T]P, I assume that multiplying these parameter matrices by P (world) gives p (pixel). But what I need is the opposite: to project the p (pixel) coordinates to P (world coordinates) on the ground plane.

I am studying electrical and electronics engineering. I did not take image processing or advanced linear algebra classes. As I remember from my linear algebra course, we can rearrange the transformation as P = [R|T]^-1 * C^-1 * p. However, that is in the Euclidean coordinate system, and I do not know whether such a thing is possible in homogeneous coordinates. Moreover, the 3x4 matrix [R|T] is not invertible, and I do not know whether this is the correct way to go at all.

The intrinsic and extrinsic parameters are known. All I need is the real-world coordinate of the point projected on the ground plane. Since the point is on a plane, the coordinates are two-dimensional (depth is not important, which answers the usual objection to single-view geometry). The camera is fixed (position and orientation). How should I find the real-world coordinates of a point in an image captured by a single camera?

EDIT I have been reading "Learning OpenCV" by Gary Bradski & Adrian Kaehler. On page 386, under the Calibration -> Homography section, it is written: q = sMWQ, where M is the camera intrinsic matrix, W is the 3x4 matrix [R|T], and s is an "up to" scale factor which I assume is related to the homography concept (I do not understand it clearly). q is the pixel coordinate and Q is the real-world coordinate. To get the real-world coordinate (on the chessboard plane) of an object detected on the image plane, the book sets Z = 0, so the third column of W (the third column of R) no longer contributes. Trimming these unnecessary parts leaves W as a 3x3 matrix, and H = MW is a 3x3 homography matrix. Now we can invert the homography matrix and left-multiply q with it to get Q = [X Y 1], where the Z coordinate was trimmed.
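The procedure described in this edit can be sketched in a few lines of NumPy. This is only a minimal sketch, not the book's code: the function names and all numeric values are hypothetical, and it assumes the convention x_cam = R * X_world + T with R already a 3x3 matrix (OpenCV's calibration returns a rotation vector per image, which would first have to be converted, for example with cv2.Rodrigues).

```python
import numpy as np

def plane_homography(M, R, t):
    """Build H so that s * [u, v, 1]^T = H @ [X, Y, 1]^T for points with Z = 0."""
    # With Z = 0 the third column of R is always multiplied by zero, so drop it.
    W = np.column_stack((R[:, 0], R[:, 1], np.ravel(t)))
    return M @ W

def pixel_to_plane(H, u, v):
    """Map pixel (u, v) back to (X, Y) on the Z = 0 plane."""
    Q = np.linalg.solve(H, np.array([u, v, 1.0]))  # same as inv(H) @ q
    return Q[:2] / Q[2]  # divide by the third component to remove the scale s

# Hypothetical intrinsics and pose, used only for a round-trip check.
M = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(angle), -np.sin(angle)],
              [0.0, np.sin(angle), np.cos(angle)]])
t = np.array([10.0, -20.0, 300.0])  # same units as the chessboard squares

H = plane_homography(M, R, t)

# Project a known plane point to pixels, then map it back.
q = H @ np.array([50.0, 40.0, 1.0])
u, v = q[:2] / q[2]
X, Y = pixel_to_plane(H, u, v)
print(X, Y)  # should recover approximately (50, 40)
```

The division by the third component is the step that removes the unknown scale factor s. The recovered X and Y are in whatever units were used for the chessboard square size, and they are only valid for the one chessboard pose whose R and T were used.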

I applied the algorithm described above and got results that cannot lie within the area covered by the image (the plane was parallel to the camera, about 30 cm in front of it, and I got results like 3000). The chessboard square sizes were entered in millimeters, so I assume the output real-world coordinates are also in millimeters. Anyway, I am still trying things. By the way, the raw results were very, very large at first, but I divide all values in Q by the third component of Q to get (X, Y, 1).

FINAL EDIT

I could not get the camera calibration methods to work. In hindsight, I should have started with perspective projection and transforms. With that approach I got very good estimates using a perspective transform between the image plane and the physical plane (the transform was generated from 4 pairs of corresponding coplanar points on the two planes). Then I simply applied the transform to the image pixel points.
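For readers who want to try the four-point approach, here is a minimal NumPy sketch. It is not the code actually used in this project: the function names and the point correspondences are hypothetical. In OpenCV the same two steps are usually done with cv2.getPerspectiveTransform and cv2.perspectiveTransform.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] = 1) that maps each src point to its dst point."""
    A, b = [], []
    for (x, y), (X, Y) in zip(src, dst):
        # X = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), and likewise for Y.
        A.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        A.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        b.extend([X, Y])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, x, y):
    """Apply H to one point and divide by the homogeneous coordinate."""
    p = H @ np.array([x, y, 1.0])
    return p[:2] / p[2]

# Hypothetical correspondences: pixel corners of a marked rectangle on the
# ground, and the measured positions of the same corners in millimeters.
pixels = [(100, 400), (540, 400), (420, 150), (220, 150)]
plane_mm = [(0, 0), (300, 0), (300, 400), (0, 400)]

H = homography_from_4_points(pixels, plane_mm)

# Convert a detected circle center from pixels to millimeters on the plane.
center_mm = apply_homography(H, 320, 275)
print(center_mm)
```

The four reference points must lie on the same plane as the circles, and no three of them may be collinear. Lens distortion is ignored here, so accuracy depends on how close the camera is to an ideal pinhole.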

Maybe this other thread could help: http://stackoverflow.com/questions/7836134/get-3d-coord-from-2d-image-pixel-if-we-know-extrinsic-and-intrinsic-parameters Good luck! – blacatus, Nov 29 '14 at 17:16
I have the same problem. Can you please tell me what you finally did to find the world coordinates? I am using the formula P=C^-1 * R^-1 * (p-t), but I get big numbers, as you mentioned. – Rashed DIP, Aug 09 '18 at 13:51

score 9 · Answer 1 · edited Mar 14 '18 at 10:10

You said "i have the intrinsic camera matrix, and translational and rotational params of each of the image", but these are the translation and rotation from your camera to your chessboard. They have nothing to do with your circle. However, if you really have the translation and rotation matrices, then getting the 3D point is easy.

Apply the inverse intrinsic matrix to your screen points in homogeneous notation: C^-1*[u, v, 1], where u=col-w/2 and v=h/2-row, where col and row are the image column and row and w and h are the image width and height. As a result you obtain a 3D point in so-called camera normalized coordinates, p = [x, y, z]^T. All you need to do now is subtract the translation and apply the transposed rotation: P=R^T(p-T). The order of operations is the reverse of the original one, which was rotate and then translate. Note that the transposed rotation performs the inverse of the original rotation but is much faster to compute than R^-1.
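A minimal NumPy sketch of this back-projection for a point on the ground plane is given below, under assumptions that go beyond the text above. The function name and all numeric values are hypothetical. It assumes the OpenCV convention x_cam = R * X_world + T, and it assumes that C already contains the principal point, so the raw column and row are used without the centering step. It also adds one step: C^-1*[col, row, 1] only fixes the direction of the viewing ray, so the ray is scaled by a factor s chosen so that the resulting world point has Z = 0.

```python
import numpy as np

def pixel_to_world_on_plane(C, R, T, col, row):
    """Intersect the viewing ray through pixel (col, row) with the world plane Z = 0.

    Assumes x_cam = R @ X_world + T and that C holds the principal point,
    so raw pixel coordinates are passed in directly.
    """
    ray_cam = np.linalg.solve(C, np.array([col, row, 1.0]))  # C^-1 * [col, row, 1]
    ray_world = R.T @ ray_cam       # ray direction in world coordinates
    cam_center = -R.T @ T           # camera position in world coordinates
    # P = R^T (s * ray_cam - T) = cam_center + s * ray_world; pick s so that Z = 0.
    s = -cam_center[2] / ray_world[2]
    return cam_center + s * ray_world

# Hypothetical intrinsics and pose, used only for a round-trip check.
C = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(angle), -np.sin(angle)],
              [0.0, np.sin(angle), np.cos(angle)]])
T = np.array([10.0, -20.0, 300.0])

# Project a known ground point to a pixel, then recover it.
x_cam = R @ np.array([50.0, 40.0, 0.0]) + T
q = C @ x_cam
col, row = q[:2] / q[2]
P = pixel_to_world_on_plane(C, R, T, col, row)
print(P)  # should be close to (50, 40, 0)
```

Leaving out the scale s, that is, using the normalized point with z = 1 directly in R^T(p-T), generally gives a point that is not on the plane, which may be one source of implausibly large coordinates.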

edited Mar 14 '18 at 10:10

codekaizen

answered Mar 14 '14 at 05:55

Vlad

Thank you, Vlad, for your patience in reading my long question and answering it. I have found something in Learning OpenCV, which I have added to my original post. I will also try subtracting the translation and applying the inverse rotation. By the way, I meant that I am trying to get the circles' coordinates with respect to the chessboard origin. I am supposing that with these matrices and vectors I can get the real coordinates, on the chessboard plane, of the circles detected in the image (through some image processing I have the center pixel coordinates of the circles). Am I wrong? – user3417020 Mar 14 '14 at 17:52
It would be useful to see a picture of your setup or a diagram. I am not sure how you would learn the circles' positions and rotations w.r.t. the chessboard origin unless you measured their physical placement relative to the chessboard with a ruler. Or maybe the circles are printed on the chessboard and you know exactly where. By the way, your real-world system is centred on the camera, right? – Vlad Mar 14 '14 at 18:17
The following link shows the setup of the project. On the left is the top view of the "project setup" and on the right is the "right view". You can see that the camera is at the center of the ground plane. The height and angle of the camera are fixed. The red circles on the ground plane are processed and their pixel coordinates are found. My intention for calibrating the camera is to take, say, 15 different images plus a last image in this setup, with the chessboard lying on the ground without circles. This way I can use the rotation and translation vectors of the last image. http://postimg.org/image/cclyvn8tv/ – user3417020 Mar 14 '14 at 18:37
I will transform the pixel coordinates to physical coordinates on the ground plane according to Q(physicalPointOnGroundPlane)=H.inv()*q(imagepoint), or according to your comments as Q=R.t()*(C.inv()*q-T). This way I want to find the physical coordinates of the center of the circle, not the origin of the chessboard. I assume the chessboard is just for finding the intrinsic parameters and the extrinsic parameters (when the chessboard is laid on the ground). Otherwise, what is the use of the chessboard if I cannot transform an arbitrary pixel on the image plane to the physical object (ground) plane? – user3417020 Mar 14 '14 at 18:58
I see, I did not realize everything lies on one plane. In this case, by the way, you can do everything with a single camera and an accelerometer (as in a cell phone setup). If you know the camera height z, then in one dimension the distance dist to the ground will be dist = z/cos(beta), where beta for each ray can come from the accelerometer. – Vlad Mar 14 '14 at 19:43
Thank you for your comments. I could not follow the accelerometer method. However, I solved my problem using the concept of perspective projection between the physical and image planes. Now I can get the physical coordinate of each pixel in the image with an accurate estimate. – user3417020 Mar 15 '14 at 16:48
Can you please send me the code or a useful link that you finally used? Mr @user3417020 – Rashed DIP Aug 09 '18 at 14:00
