How to estimate the surface normal of point I(i,j) on a depth image (pixel value in mm) without using Point Cloud Library(PCL)? I've gone through (1), (2), and (3) but I'm looking for a simple estimation of surface normal on each pixel with C++ standard library or openCV.

[depth image with the pixel of interest circled]

askingtoomuch

1 Answer


You need to know the camera's intrinsic parameters, so that you can convert the spacing between pixels into the same units (mm). This spacing is only valid at a particular distance from the camera (e.g. at the depth value of the center pixel).

If the camera matrix is K, which typically looks like:

    f  0  cx
K=  0  f  cy
    0  0   1

Then, for pixel coordinates (x, y), the ray from the camera origin through that pixel (in camera coordinate space) is:

              x
P = inv(K) *  y
              1

Depending on whether the distance in your image is a projection onto the Z axis or a Euclidean distance from the camera center, you need to either normalize the vector P so that its magnitude equals the distance to the pixel, or make sure the z component of P is this distance. For pixels near the center of the frame, the two are nearly identical.

If you do the same operation on nearby pixels (say, left and right), you get Pl and Pr in units of mm. The norm of (Pl - Pr) is then twice the spacing between adjacent pixels in mm.

Then, calculate the gradients in X and Y:

gx = (Pi+1,j - Pi-1,j) / (2*pixel_size)
gy = (Pi,j+1 - Pi,j-1) / (2*pixel_size)

Then, take the two gradients as direction vectors:

ax = atan(gx),  ay = atan(gy)


     | cos ax    0    sin ax |   |1|
dx = |    0      1       0   | * |0|
     | -sin ax   0    cos ax |   |0|

     |    1      0       0   |   |0|
dy = |    0   cos ay -sin ay | * |1|
     |    0   sin ay  cos ay |   |0|

N = cross(dx,dy);

You may need to check that the signs make sense by looking at a known gradient and seeing whether dx, dy point in the expected direction. You may need to negate none, one, or both angles, and likewise the N vector.

Photon
  • May I know how to get the distance between pixels given camera's intrinsic parameters: `fx: 365.40, fy: 365.40, cx: 260.93, cy: 205.60` ? – askingtoomuch Jun 23 '15 at 08:50
  • Added more details above. – Photon Jun 23 '15 at 09:35
  • Thanks for the details. I calculated P for the pixel circled above with coordinate (273,163) and I got `P = [0.0330 -0.1166 1]T`. What is this supposed to mean? The depth value above is the distance in mm from the camera. – askingtoomuch Jun 23 '15 at 10:38
  • multiply the vector by the distance, so the Z value has the distance. After that, the X,Y will be in mm at the object – Photon Jun 23 '15 at 13:35
  • according to your method, the normal is always pointing in Z direction. Is this correct? – askingtoomuch Jun 24 '15 at 06:04
  • If +Z is the direction of camera view, all normals should be to -Z, which makes sense because normals facing away you can't see (occluded), and ones that have Z=0 you will also not see as they face sideways. If you get the opposite Z compared to expected, it means you got a normal that goes into the object. Just multiply by -1 – Photon Jun 24 '15 at 06:34
  • normal pointing to Z direction means the surface is perpendicular to the view whereas it's not the case for the depth image – askingtoomuch Jun 24 '15 at 06:48
  • There are multiple ways to view 3D coordinate spaces. My analysis was given that the Z axis is the axis from the camera forward, and X,Y are in the image plane, as they often are in 2D image processing. The direction of whether it's +/- Z is a matter of making sure your X,Y,Z follow the right hand rule. – Photon Jun 24 '15 at 10:16
  • you are correct about the Z axis. But if I get the cross product of gx*(1,0,0) and gy*(0,1,0), the normal will be (0,0,Nz). Shouldn't it be N(Nx,Ny,Nz) instead? where `N = vector A X vector B, vector A = P_right - P_left, vector B = P_down - P_up` – askingtoomuch Jun 24 '15 at 11:42
  • You're correct. My mistake. The gradients should be used as the tangent of the angle of rotation of the pure x,y axes. I'll modify the description above – Photon Jun 24 '15 at 17:39
  • It's not true that all normals should be to -Z if +Z is the direction of the camera view. Take for example this sketch: http://i.imgur.com/nY8YmXr.png where the walls of the side of the room are slanted a bit away, but you can still see them. You can't see a surface if its normal is facing away from the **ray** hitting it, but it's perfectly possible to see a surface that is facing away from the camera axis. – etarion Aug 25 '17 at 10:10
  • Can someone explain where the atan comes from? – ASML May 08 '18 at 01:58
  • This answer -> https://stackoverflow.com/questions/34644101/calculate-surface-normals-from-depth-image-using-neighboring-pixels-cross-produc directly uses gx and gy to get the gradient vectors (1,0,gx) and (0,1,gy), which are dx and dy in your answer. Is that an approximation of the solution you give above? – MonsieurBeilto Sep 04 '18 at 03:20
  • Your answer kind of makes sense to me. I have two doubts - 1) when calculating gx, why do you divide by pixel_size? I thought you should divide by 2mm. 2) Where did you use the depth value of the pixel? Is there a resource where I can study this method? – MonsieurBeilto Sep 06 '18 at 21:52
  • The division is by (2*pixel_size) which means you would need to calculate the span of what 2 pixels are worth at the distance in question. If I would write this answer today I would not only take 2 values for each gradient, but rather a set of pixels around the ray's center and calculate a best fit plane using least squares. – Photon Sep 08 '18 at 13:41