
My problem is quite simple yet I struggle to solve it correctly.

I have a camera looking towards the ground and I know all the parameters of the shot. So, using some maths I was able to compute the 4 points defining the field of view of the camera (the coordinates on the ground of each image's corners).

Now, from the coordinates (x, y) of a pixel of the image, I would like to know its real coordinates projected on the ground.

I thought that homography was the way to go, but I read here and there that "homography maps a plane seen from a camera to the same plane seen from another" which is a slightly different problem.

What should I use, please?


Edit: Here is an example.

Given this image: [image: Ground]

I know everything about the camera that took the picture (height, angles of view, orientation), so I could calculate the coordinates of the four corners forming its field of view on the ground, for example (in centimeters, relative to the camera position, clockwise from top-left): (-300, 500), (300, 500), (100, 50), (-100, 50).

Knowing that the coordinates on the image of the blade of grass are (1750, 480), how can I know its actual coordinates on the ground?

Delgan
  • Is it possible to see a sample image? It seems that homography can still be useful because the plane is always the same, the ground. – UJIN Mar 08 '17 at 15:14
  • @UJIN I updated my question with an example. – Delgan Mar 08 '17 at 15:34
  • It should be possible to find the coordinates of the grass in the "projected" plane, but I can't wrap my head around it now. I will try as soon as I have some spare time. I mean, you have 4 points in one coordinate system, and 4 points in another. It should be possible to use the 8-point algorithm, find a homography, and then project points from the photo to the actual ground. But at this point I may be all wrong :/ – UJIN Mar 08 '17 at 17:34
  • @UJIN Thank you for your time! I think I will go with a simple homography then. – Delgan Mar 08 '17 at 19:10
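
As the comments suggest, the four image corners and their four known ground positions are enough to fit a homography, because both sets of points lie on a plane. Here is a minimal NumPy sketch of that idea. It assumes a 1920x1080 image (the question does not give the image size) and image corners listed clockwise from top-left to match the ground corners. `fit_homography` and `apply_homography` are hypothetical helper names, not library functions.

```python
import numpy as np

def fit_homography(src, dst):
    """Fit the 3x3 homography H (with H[2, 2] fixed to 1) from 4 point pairs,
    so that H @ [x, y, 1] is proportional to [X, Y, 1]."""
    rows, rhs = [], []
    for (x, y), (X, Y) in zip(src, dst):
        # X = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), same pattern for Y
        rows.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        rhs.append(X)
        rows.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        rhs.append(Y)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map one (x, y) pixel to ground coordinates (perspective division included)."""
    X, Y, w = H @ np.array([point[0], point[1], 1.0])
    return X / w, Y / w

# ASSUMPTION: image size is hypothetical, it is not stated in the question.
IMG_W, IMG_H = 1920, 1080
image_corners = [(0, 0), (IMG_W, 0), (IMG_W, IMG_H), (0, IMG_H)]   # clockwise from top-left
ground_corners = [(-300, 500), (300, 500), (100, 50), (-100, 50)]  # cm, from the question

H = fit_homography(image_corners, ground_corners)
grass_ground = apply_homography(H, (1750, 480))
print(grass_ground)
```

Under the assumed 1920x1080 size, the blade of grass lands at roughly (130.7, 182.4) cm. The result is only as good as the four computed corners, and a different image size changes the numbers.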

1 Answer


By "knowing everything" about the camera, do you mean you have the camera FOV, rotation and translation with respect to the ground plane? Then it's trivial, right?

Write the camera matrix K = [[f, 0, w/2], [0, f, h/2], [0, 0, 1]], where f is the focal length in pixels and (w, h) is the image size. Let R and t be, respectively, the 3x3 rotation matrix and 3x1 translation from camera to ground coordinates. A point on the ray going through a given pixel p = [u, v, 1] has camera coordinates r = inv(K) * p. In world coordinates the ray starts at the camera center t and passes through R * r + t; intersect it with the ground plane and you are done.
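
The steps above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: square pixels, the principal point at the image center, a ground plane of `z = 0` in world coordinates, and `f` derived from a horizontal field of view. The helper names `intrinsics` and `pixel_to_ground`, and all numeric values, are hypothetical.

```python
import numpy as np

def intrinsics(width, height, hfov_deg):
    """Build K, assuming square pixels and a centered principal point."""
    f = (width / 2) / np.tan(np.radians(hfov_deg) / 2)  # focal length in pixels
    return np.array([[f, 0.0, width / 2],
                     [0.0, f, height / 2],
                     [0.0, 0.0, 1.0]])

def pixel_to_ground(u, v, K, R, t):
    """Intersect the viewing ray of pixel (u, v) with the plane z = 0.

    R (3x3) and t (3,) map camera coordinates to world coordinates:
    X_world = R @ X_cam + t, so t is the camera center in the world frame.
    """
    r = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction, camera frame
    d = R @ r                                     # ray direction, world frame
    if abs(d[2]) < 1e-12:
        raise ValueError("ray is parallel to the ground plane")
    s = -t[2] / d[2]                              # solve t_z + s * d_z = 0
    if s <= 0:
        raise ValueError("ray points away from the ground")
    return (t + s * d)[:2]

# Hypothetical setup: camera 200 cm above the ground, looking straight down.
# Camera axes: x right, y down in the image, z along the optical axis.
K = intrinsics(1920, 1080, hfov_deg=90.0)
R = np.diag([1.0, -1.0, -1.0])     # optical axis points toward world -z
t = np.array([0.0, 0.0, 200.0])

print(pixel_to_ground(960, 540, K, R, t))   # image center: directly below the camera
print(pixel_to_ground(1060, 540, K, R, t))  # 100 px to the right of the center
```

For a tilted camera, only `R` changes; the intersection step stays the same.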

Francesco Callari
  • @Francesco Callari Could you elaborate on why you subtract inv(Q)[0 0 0 1]? – Luca Nov 14 '18 at 14:11
  • @Delgan I am not sure if you understood why we subtract with `inv(Q)[0,0,0,1]`? I understand that the ray will pass through `inv(Q)[r, 1]` and `inv(Q)[0,0,0,1]` but I am not sure why these quantities are subtracted. Would really appreciate some clarification on this as I am really struggling with a similar issue. – Luca Nov 14 '18 at 16:28
  • @Delgan Also how do you estimate the equation of the ground plane? – Luca Nov 14 '18 at 16:44
  • Hey @Luca! I'm sorry, I would like to help you, but I actually never took the time to properly implement @Francesco's solution... :/ In my case, the ground plane equation was simply `z = 0`. – Delgan Nov 14 '18 at 20:18