0

I have a 2D array of size n representing n number of points in the 3D space, position[][] for XYZ (e.g. position[0][0] is X, position[0][1] is Y, and position[0][2] is Z coordinate of point 0.

What I need to do is to do clustering on the points, so to have n/k number of clusters of size k so that each cluster consists of the k closest points in the 3D space. For instance, if n=100 and k=5, I want to have 20 clusters of 5 points which are the closest neighbors in space.

How can I achieve that? (I need pseudo-code. For snippets preferably in Java)

What I was doing so far was a simple sorting based on each component. But this does NOT give me necessarily the closest neighbors.

  1. Sort based on X (position[0][0])
  2. Then sort based on Y (position[0][1])
  3. Then sort based on Z (position[0][2])


for (int i=0; i<position.length; i++){
  for (int j=i+1; j<position.length; j++){
    if(position[i][0] > position[i+1][0]){
      swap (position[i+1][0], position[i][0]);
     }
   }
}
// and do this for position[i][1] (i.e. Y) and then position[i+2][2] (i.e. Z)

I believe my question slightly differs from the Nearest neighbor search with kd-trees because neighbors in each iteration should not overlap with others. I guess we might need to use it as a component, but how, that's the question.

Tina J
  • 4,983
  • 13
  • 59
  • 125
  • What did your search for 3d or point cloud clustering techniques come up with so far? – Thomas Aug 14 '17 at 08:33
  • See *nearest neighbor search*: http://www.alglib.net/other/nearestneighbors.php – meowgoesthedog Aug 14 '17 at 08:35
  • I couldn't find clustering for k closest neighbors in 3D space. I came up with my own algorithm as above. – Tina J Aug 14 '17 at 08:35
  • 1
    @TinaJ I would create a low res 3D voxel map to compute the density of points per its cell. If above threshold increase resolution of that cell (recursivelly) until you forund your approx cluster positions ... then group points by the distance to them ... something like [Finding holes in 2d point sets?](https://stackoverflow.com/a/21884021/2521214) for the density map and [Effective gif/image color quantization?](https://stackoverflow.com/a/30265253/2521214) for the clustering ... – Spektre Aug 17 '17 at 21:09
  • @Spektre umm, do you mean at each voxel, decrease the density? Maybe I can do that starting from the deepest octree node and come higher layers until i fulfill my target density. Is this what you mean? That is my current solution in mind. – Tina J Aug 18 '17 at 04:27
  • 1
    @TinaJ I created an aswer with description of what I meant.... – Spektre Aug 18 '17 at 07:16
  • Btw, in an octree all points would be essentially leaves of the octree right? So deeper nodes are higher granularity, with smaller voxels sizes. Is this correct? – Tina J Aug 18 '17 at 12:35

2 Answers2

1

At start you do not have a octree but list of points instead like:

float position[n][3];

So to ease up the clustering and octree creation you can use 3D point density map. It is similar to creating histogram:

  1. compute bounding box of your points O(n)

    so process all points and determine min and max coordinates.

  2. create density map O(max(m^3,n))

    So divide used space (bbox) into some 3D voxel grid (use resolution you want/need) do a density map like:

    int map[m][m][m]`
    

    And clear it with zero.

    for (int x=0;x<m;x++)
     for (int y=0;y<m;y++)
      for (int z=0;z<m;z++)
       map[x][y][z]=0;
    

    Then process all points determine its cell position from x,y,z and increment it.

    for (int i=0;i<n;i++)
     {
     int x=(m-1)*(position[i][0]-xmin)/(xmax-xmin);
     int y=(m-1)*(position[i][1]-ymin)/(ymax-ymin);
     int z=(m-1)*(position[i][2]-zmin)/(zmax-zmin);
     map[x][y][z]++;
     // here you can add point i into octree belonging to leaf representing this cell
     }
    

    That will give you low res density map. The higher number in cell map[x][y][z] the more points are in it which means a cluster is there and you can also move point to that cluster in your octree.

This can be recursively repeated for cells that have enough points. To make your octree create density map 2x2x2 and recursively split each cell until its count is lesser then threshold or cell size is too small.

For more info see similar QAs

Spektre
  • 49,595
  • 11
  • 110
  • 380
  • This is good. One thing is we are limited by the size of map cells. What is th best size?! I was thinking about an approach which is independent of choosing a number. – Tina J Aug 18 '17 at 12:24
  • Btw, you wrote in one line to add point i into octree. So I assume this approach can be combined with an octree to store points. While also can be used to create an octree by recursion. Right? – Tina J Aug 18 '17 at 12:26
  • @TinaJ octree and recursive density map has almost the same structure. Best resolution depends on the usage and data. Usually you stop if number of points/object per cell is easily handled by the app processing for example if you're rendering there is no point to have single point per leaf when gfx card can easily handle thousands of primitives in RT. So no there is no need to have single point per leaf in octree unless you want it for some purpose ... The best octree resolution is determined by the performance of the app youre creating so divide until the performance stall or drops again ... – Spektre Aug 18 '17 at 12:50
  • I see. So in theory my assumption is correct, but in practice might not be possible for large points. – Tina J Aug 18 '17 at 13:13
  • @Spektre Can I talk with you, I'm Ahmed, Austria, please give me a contact method – andre_lamothe Feb 05 '21 at 15:11
0

What you what is not Clustering. From what you said, I think you want to divided your N points into N/k groups, with each group have k points, while keeping the points in each cluster are closest in the 3D space.

Think an easy example, if you want to do the same thing one one dimension, that is, just sort the numbers, and the first k points into cluster 1, the second k points into cluster 2, and so on.

Then return the 3D space problem, the answer is the same. Just first find the point with minimum x-axis, y-axis and z-axis, altogether with its closest k-1 points into Cluster 1. Then for the lest points, find the minimum x-axis, y-axis and z-axis points, and k-1 closest points not clustered into Cluster 2, and so on.

Above process will get your results, but that maybe not meaningful in practice, maybe cluster algorithms such as k-means could help you.

chenxingwei
  • 251
  • 1
  • 9