21

I have a depth image that I've generated from 3D CAD data. A depth image like this could also come from a depth sensor such as the Microsoft Kinect or a stereo camera. So basically it is a depth map of the points visible in the imaging view; in other words, it is a segmented point cloud of an object from a certain view.

I would like to determine (or estimate) the surface normal at each point, and then find the tangent plane at that point.

How can I do this? I've done some research and found some techniques, but I didn't understand them well enough to implement them. More importantly, how can I do this in Matlab or OpenCV? I couldn't manage to do it using the surfnorm command; AFAIK it needs a single surface, and I have partial surfaces in my depth image.

This is an example depth image.

[example depth image of a mug]

[EDIT]

What I want to do is this: after I get the surface normal at each point, I will create tangent planes at those points. Then I will use those tangent planes to decide whether a point comes from a flat region or not, by taking the sum of distances of its neighboring points to the tangent plane.
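For concreteness, a minimal sketch of that flatness test in C++ (all names are purely illustrative; it assumes the unit normal at the point has already been estimated somehow):

#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Sum of absolute distances from the neighbours to the tangent plane
// through p with unit normal n; a small sum suggests a flat region.
float Flatness(const Vec3& p, const Vec3& n, const std::vector<Vec3>& neighbors) {
    float sum = 0.0f;
    for (const Vec3& q : neighbors) {
        Vec3 d{q.x - p.x, q.y - p.y, q.z - p.z};
        sum += std::fabs(dot(n, d));  // point-to-plane distance, |n . (q - p)|
    }
    return sum;
}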

guneykayim
  • how would you define the surface normal at a depth discontinuity point? – Shai May 27 '14 at 17:13
  • Oh, you mean points like edges, or the bottom part visible inside the mug in this image? Well, if there is a discontinuity then the surface normal is not important to me, so I don't need to define it. – guneykayim May 27 '14 at 17:16
  • Have you tried just taking finite differences of adjacent pixels? The result will likely be noisy and need some smoothing, plus you'll need to do something to handle discontinuities, but it should give you something good enough for lighting. What do you need the normals for? – mattnewport May 27 '14 at 17:50
  • I don't know what finite differences are, so I haven't tried that; I will look into it. I'm not going to use the normals for lighting: after I get the normals I will calculate the tangent plane at each point, then use that plane to determine whether the point is flat or not by taking the sum of distances of neighbouring points to the plane. Simply put, I'm trying to understand whether a point selected on the object comes from a flat surface or not. – guneykayim May 27 '14 at 19:31
  • The idea is that you need to define regions in which you can calculate your surface normals. Generally in Computer Graphics, a triangle is the most used polygon in which a normal is calculated by [cross product of the two non-parallel sides](https://en.wikipedia.org/wiki/Normal_(geometry)#Calculating_a_surface_normal). However, for implementation purposes, I suggest you look at [this code from OpenGL forums](https://www.opengl.org/wiki/Calculating_a_Surface_Normal). HTH – scap3y May 30 '14 at 19:01

4 Answers

8

So there are a couple of things that are undefined in your question, but I'll do my best to outline an answer.

The basic idea is to take the gradient of the image and then apply a transformation to the gradient to get the normal vectors. Taking the gradient in Matlab is easy:

[m, g] = imgradient(d);

gives us the magnitude (m) and the direction (g) of the gradient (relative to the horizontal and measured in degrees) of the image at every point. For instance, if we display the magnitude of the gradient for your image it looks like this:

[gradient magnitude of the example depth image]

Now, the harder part is to take this information we have about the gradient and turn it into a normal vector. In order to do this properly we need to know how to transform from image coordinates to world coordinates. For a CAD-generated image like yours, this information is contained in the projection transformation used to make the image. For a real-world image like one you'd get from a Kinect, you would have to look up the spec for the image-capture device.

The key piece of information we need is this: just how wide is each pixel in real-world coordinates? For non-orthographic (perspective) projections, like those used by real-world image capture devices, we can approximate this by assuming each pixel represents light within a fixed angle of the real world. If we know this angle (call it p and measure it in radians), then the real-world distance covered by a pixel is just sin(p) .* d, or approximately p .* d, where d is the depth of the image at each pixel.

Now if we have this info, we can construct the 3 components of the normal vectors:

width = p .* d;                  % real-world width of each pixel at its depth
gradx = m .* cosd(g) .* width;   % imgradient reports the direction in degrees,
grady = m .* sind(g) .* width;   % so use cosd/sind; multiply element-wise

% The unnormalized normal of a surface z = f(x, y) is (-df/dx, -df/dy, 1).
normx = - gradx;
normy = - grady;
normz = 1;

len = sqrt(normx .^ 2 + normy .^ 2 + normz .^ 2);
x = normx ./ len;
y = normy ./ len;
z = normz ./ len;
Isaac
  • Well, you can ask me about the undefined parts; I thought I had given all the necessary information. I couldn't try what you suggested because my Matlab version does not include the imgradient function, but I will have the new version soon, so I can try it then. There is one part of your solution I didn't understand, though: you said the surface normal depends on the viewing angle, yet intuitively I'd say the surface normal is invariant to the viewing angle and should always be the same. Am I wrong? – guneykayim May 27 '14 at 19:59
  • The undefined part is what you will actually be using these vectors for, and what device will be generating your images. It would probably be possible for you to implement your own `imgradient` function if you don't feel like waiting. The surface normal depends not on the viewing angle, but on the projection used to generate the image: if you used an orthographic projection instead of a perspective projection to generate your image, you would get a different image despite the true normals being the same in both, and therefore you need to know the projection to recover the normals. – Isaac May 27 '14 at 20:49
  • Can you give some documents that I can read and understand better? – guneykayim May 28 '14 at 05:50
  • Understand which part? – Isaac May 28 '14 at 19:18
  • Recovering the normals using gradient and viewing angle. I did not understand how it is done. I mean the theory behind it. – guneykayim May 28 '14 at 19:20
  • Why is the normz value set to 1? – MonsieurBeilto Sep 30 '18 at 00:05
5

What mattnewport is suggesting can be done in a pixel shader. For each pixel you calculate two vectors A and B, and the cross product of those vectors gives you the normal. The way you calculate the two vectors is like so:

float2 du; // texel step in u; values sent to the shader based on the depth image's width
float2 dv; // texel step in v; normally du = float2(1.0/width, 0) and dv = float2(0, 1.0/height)

float D  = sample(depthtex, uv);       // depth at the pixel
float D1 = sample(depthtex, uv + du);  // depth one pixel to the right
float D2 = sample(depthtex, uv + dv);  // depth one pixel below

// Two vectors that locally span the surface, in image-scaled units.
float3 A = float3(du.x * width_of_image, 0, D1 - D);
float3 B = float3(0, dv.y * height_of_image, D2 - D);

float3 normal = cross(A, B);
return normal;

This will break where there are discontinuities in the depth values.

To decide in the pixel shader whether a surface is flat, the second-order partial derivatives can be used. You calculate them by taking finite differences, and then finite differences of those, like so:

float D  = sample(depthtex, uv);
float D1 = sample(depthtex, uv + du);  // one pixel to the right
float D3 = sample(depthtex, uv - du);  // one pixel to the left

float h   = du.x;             // scalar texel step in u
float dx1 = (D1 - D) / h;     // forward difference
float dx2 = (D - D3) / h;     // backward difference
float dxx = (dx1 - dx2) / h;  // second derivative in x

In the same way you have to calculate dyy and dxy (dyx equals dxy for smooth data). The surface is locally flat where dxx = dyy = dxy = 0.

Typically, you'd choose du and dv to be 1/width and 1/height of the depth image.

All of this happens on the GPU, which makes everything really fast. But if you don't care about that, you can run this method on the CPU as well. The only issue is that you have to replace a function like sample with your own implementation: it takes the depth image and u, v values as input and returns the depth value at the sampled point.

Edit:

Here's a hypothetical sampling function that does nearest neighbour sampling on the CPU.

float Sample(const Texture& texture, vector_2d uv){
    int x = (int)(uv.x * texture.width  + 0.5);  // round to the nearest texel
    int y = (int)(uv.y * texture.height + 0.5);
    return texture.data[x][y];
}
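
To make the CPU path concrete, here is a minimal C++ sketch of the same cross-product normal computation, assuming the depth image is stored as a row-major float array; all names and types here are illustrative, not from any particular library.

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Normal at pixel (x, y) from one-pixel finite differences, mirroring the
// shader above; clamps at the image border.
Vec3 NormalAt(const std::vector<float>& depth, int width, int height, int x, int y) {
    float d  = depth[y * width + x];
    float d1 = depth[y * width + std::min(x + 1, width - 1)];   // right neighbour
    float d2 = depth[std::min(y + 1, height - 1) * width + x];  // lower neighbour

    Vec3 a{1.0f, 0.0f, d1 - d};   // one-pixel step in x
    Vec3 b{0.0f, 1.0f, d2 - d};   // one-pixel step in y
    Vec3 n = cross(a, b);

    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return {n.x / len, n.y / len, n.z / len};
}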
Arun R
  • Just FYI – [this](http://en.wikipedia.org/wiki/Second_partial_derivative_test) wiki page has more info on this. You can also read up on [Hessian matrices](http://en.wikipedia.org/wiki/Hessian_matrix), which are related to this. – Arun R Jun 02 '14 at 14:54
  • Can you explain your solution a little bit more? Which variable stands for what? What is this `sample` function? Can I have details please? – guneykayim Jun 05 '14 at 08:49
  • You need to read up a bit on texture sampling to have the params figured. [Here's](http://msdn.microsoft.com/en-us/library/windows/desktop/bb206250(v=vs.85).aspx) a documentation on texture sampling. If you're not doing this on a shader, you'll have to implement a sampling function yourself on the CPU. – Arun R Jun 05 '14 at 18:57
1

I will describe what I think you have to do conceptually, and provide links to the relevant parts of OpenCV.

To determine the normal of a given (3D) point in a point cloud:

  1. Create a kd-tree or (balltree?) representation of your point cloud so that you can efficiently compute the k nearest neighbors. Your choice of k should depend on the density of your data. http://docs.opencv.org/trunk/modules/flann/doc/flann_fast_approximate_nearest_neighbor_search http://physics.nyu.edu/grierlab/manuals/opencv/classcv_1_1KDTree.html

  2. After querying for the k-nearest neighbors of a given point p, use them to find a best fit plane. You can use PCA to do this. Set maxComponents=2. http://physics.nyu.edu/grierlab/manuals/opencv/classcv_1_1PCA.html https://github.com/Itseez/opencv/blob/master/samples/cpp/pca.cpp

  3. Step 2 should return two eigenvectors which define the plane you are interested in. The cross product of these two vectors gives (an estimate of) your desired normal vector. You can find info on how to calculate this in OpenCV (Mat::cross) http://docs.opencv.org/modules/core/doc/basic_structures.html A sketch combining all three steps follows below.
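
As a rough illustration of those three steps, here is a hedged OpenCV (C++) sketch; it assumes the cloud is an N x 3 CV_32F matrix and the query point a 1 x 3 CV_32F matrix, and is meant as an outline rather than a drop-in implementation:

#include <opencv2/core.hpp>
#include <opencv2/flann.hpp>

// Estimate the normal at one query point of an N x 3, CV_32F point cloud.
cv::Vec3f estimateNormal(const cv::Mat& cloud, const cv::Mat& query, int k) {
    // Step 1: kd-tree for fast k-nearest-neighbour queries.
    cv::flann::Index kdtree(cloud, cv::flann::KDTreeIndexParams(4));
    cv::Mat indices, dists;
    kdtree.knnSearch(query, indices, dists, k);

    // Collect the k neighbours, one point per row.
    cv::Mat neighbors(k, 3, CV_32F);
    for (int i = 0; i < k; ++i)
        cloud.row(indices.at<int>(0, i)).copyTo(neighbors.row(i));

    // Step 2: PCA with maxComponents = 2 yields the two in-plane
    // directions of the best-fit plane through the neighbours.
    cv::PCA pca(neighbors, cv::Mat(), cv::PCA::DATA_AS_ROW, 2);
    const cv::Mat& ev = pca.eigenvectors;  // 2 x 3, one eigenvector per row
    cv::Vec3f e0(ev.at<float>(0, 0), ev.at<float>(0, 1), ev.at<float>(0, 2));
    cv::Vec3f e1(ev.at<float>(1, 0), ev.at<float>(1, 1), ev.at<float>(1, 2));

    // Step 3: the cross product of the two eigenvectors estimates the normal.
    cv::Vec3f n = e0.cross(e1);
    return n * (1.0f / (float)cv::norm(n));
}

A reasonable k might be 10 to 50 depending on point density, and you may want to flip the sign of the result so it points toward the camera.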

samfr
0

You can refer to equations (5), (6) and (8) in "Efficiently Combining Positions and Normals for Precise 3D Geometry" (Nehab et al., SIGGRAPH 2005):

[images of equations (5), (6) and (8) from the paper]

LogWell