5

I have a 3D point clouds with million of points. I want to store these points in 3D voxel space. The number of voxles along coordinate axis are more than 3000(x), 4000(y), 1500(z), for a total of 3000*4000*1500 voxels. I need to store in a voxel; maximum number of points, min height, max height and centorid. However, 90% of voxels are empty. So it takes lot of memory to store this. Actually, I want to search 26 neighbor voxels of each voxels in later. So What is the best way to store this data in voxel space and get access to this efficiently?

Creating a multidimensional array, is not the best solution, in term of performance...please any hint?

Eran Egozi
  • 775
  • 1
  • 7
  • 18
user1286780
  • 53
  • 1
  • 4
  • 4
    At 90% empty, you still have 1.800.000.000 voxels. That is not going to fly in any case. You can store runs of voxels along one dimension, but lookup is going to be costly. – Anteru Mar 22 '12 at 19:17
  • 1
    @leftaroundabout, it could be a 4D voxel, or not? – usr Mar 22 '12 at 19:56
  • 1
    @usr voxel = volumetric picture element, volume is 3D by definition – pezcode Mar 22 '12 at 20:05
  • 1
    Volume is _not_ defined to be 3D. At least according to Wikipedia. – usr Mar 22 '12 at 20:29
  • When working with Geometrical Data, 0D is any point in space and can reside in any dimensional space; even a vector2, vector3 etc. that is used as location and not direction with magnitude is 0D space. Another words 0D space is the any arbitrary point in space. 1D space is a connection between two points (distance) and this can be a 2D or 3D vector in space where the direction and or magnitude is considered valid important information. Another words 1D is distance or a field of measure between to locales or two points... – Francis Cugler Apr 06 '16 at 02:29
  • (...continued) 2D space is when you have at minimum 3 vectors or more with interior angels of an intersection thus giving you an enclosed area. 3D space is when a plane or area of a geometrical shape is extruded with another plane or axis that has a triple perpendicular point of reference. This defines volume. If you look at pure mathematics this can be proven through simple algebraic functions and the calculus performed on them. If you have 0D space and you integrate, where there is no variable of integration and you are integrating a constant number say 3, you end up with 3x + c... – Francis Cugler Apr 06 '16 at 02:37
  • (...continued) If you have 1D space or a field of measure, any scalar value x, and you integrate you end up with 1/2x^2 + c. Now let's say you have the function f(x) = x^2 where all x is > 0. This portion of the quadratic equation represents the relationship of every length of side of a square and its area. If we integrate this we end up with 1/3x^3 + c. And now we have volume. If we do the same for volume of a cube and the length of its sides where f(x) = x^3 : x > 0, then you have 1/4x^4 +c and so on. However we do not have defined names for the higher dimensions of space. ... – Francis Cugler Apr 06 '16 at 02:41
  • (...continued) Now if you are looking at the applied sciences such as Chemistry, Physics, Biology etc., The definitions of volume can vary. Such as in physics it can have different meanings. It can be represented in such a problem as: the amount of Gas or Liquid that is required to build enough pressure within a container of a specified (volume) made of a specific material before that material reaches its breaking point. It can also mean the amount of output of sound by the sound waves amplitude. .... – Francis Cugler Apr 06 '16 at 02:44
  • (...continued) Now since area is defined in 2D Space and when we look at 3D space it does have volume, but it also has what is called "Surface Area". So with this understanding if we are able to perceive 4D space we can say that it has something of a Higher Order than Volume but it also has Surface Volume. This is all defined by the order and magnitude of the polynomial function or some other arbitrary algebraic, trigonometric, or any other type of basic mathematical functions where the order of the exponent is involved and you begin to integrate or derive from it. – Francis Cugler Apr 06 '16 at 02:45
  • (...continued) As a last observation and considering both pure mathematics and scientific models the understanding of Volume is defined by the amount of a substance within a specified region of space. So Volume can be understand in any dimension of space. – Francis Cugler Apr 06 '16 at 02:50

4 Answers4

3

Classical data structures for this kind of data are kd-Trees and octrees..

Also, you should definitely take a look at the spatial searching and sorting data structures implemented in CGAL.

hc_
  • 2,628
  • 1
  • 18
  • 19
  • What is the best data structure for this, kd tree or octree? Can we use voxels data structure with octree or .... sorry I don't have clear idea about these data structures. – user1286780 Mar 22 '12 at 22:17
  • 1
    @user1286780 in your original question you only asked about voxel data structures, so I'm not putting this as a separate answer. But your comment to this answer suggested that you are open to other data structures. Since you say, that you want to look at neighbours, you might want to have a look at our 2012 paper "Comparison of nearest-neighbor-search strategies and implementations for efficient shape registration" in Journal of Software Engineering for Robotics (JOSER), https://robotik.informatik.uni-wuerzburg.de/telematics/download/joser2012.pdf – josch Aug 17 '16 at 08:36
2

If it's really "just" millions of points, far more than 90% of the voxels will empty. I'd try a hashed multimap (std::unordered_multimap in C++11) from voxel coordinates to points. That gives you O(1) lookup, like an array. It comes with quite a lot overhead though, but it's probably the best compromise.

The only thing you need for this to work is an equality comparison in the voxel class, and a template specialisation std::hash<voxel>. You won't get "maximum number of points" implemented in any automatical way, but is that really useful anyway?

leftaroundabout
  • 117,950
  • 5
  • 174
  • 319
  • The lookup of neighbors will be costly though. – usr Mar 22 '12 at 19:57
  • @usr: not really. As I said, lookup is _O_ (1), for _any_ point. Neighbours just aren't faster than any other points. – leftaroundabout Mar 22 '12 at 20:04
  • There are 26 of them which makes for an expensive O(1). – usr Mar 22 '12 at 20:30
  • 1
    Yeah... but you're not getting around doing those. In a tree-based structure, lookup will have some _O_ (ld _n_) in it, which for millions of points is also twenty-something. You hope to get around this by having only lookups close to each other in the tree, but that's mathematically impossible because [there exists no homeomorphism from ℝ³ to one dimension](http://en.wikipedia.org/wiki/Space-filling_curve#Properties), which also means that some of the nodes in e.g. a _k_-d tree that sit in neighbouring voxels will nevertheless be separated by a plane high up in the tree. – leftaroundabout Mar 23 '12 at 00:15
  • 1
    Don't get me wrong: I agree that octrees and _k_-d trees are the right choice when you don't have a fixed discrete voxel rastering, because then you need to account for neighbouring points _by distance_, which is not possible with hashs. But once you have a raster and thereby a discrete metric, it is possible to use hashs, and better. – leftaroundabout Mar 23 '12 at 00:21
  • It looks like you really know your stuff. +1! – usr Mar 23 '12 at 11:23
1

One approach would be to back your actual data with data from a collection.

To illustrate:

struct t_voxel {
  size_t nPoints, minHeight, maxHeight, centorid;
};

struct t_voxel_id {
  uint16_t index;
};

// one dimension
class t_voxel_collection {
  // the actual voxel data needed for the indices represented by the collection of voxelId
  std::vector<t_voxel> d_voxel;
  // here, empty voxel is designated by t_voxel.index = 0
  // this collection is your primary array representation
  // these elements just refer to a unique or shared index in this->d_voxel
  std::vector<t_voxel_id> d_voxelId;
public:
  // >> the interface to access and set, which abstracts the backing collection.
  // and prohibits the client from accessing the actual data.

  t_voxel get(const size_t& idx) const {
    return this->d_voxel[this->d_voxelId[idx].index];
  }
  // ...
};

You can achieve a big drop in memory consumption this way (assuming you see the direction this is going).

That's not a complete answer, but could help in this scenario.

There are several ways to further optimize and share the voxel data in this collection, depending on your use.

justin
  • 104,054
  • 14
  • 179
  • 226
  • Thanks for your idea. hope the way you explained will be help to save memory allocation problem. I would like to know the efficient way to get their neihbour voxels in 3D i.e 26 voxels!!! any idea for this..... – user1286780 Mar 22 '12 at 22:21
  • @user1286780 using this approach, `t_voxel_collection` represents one dimension. the simple approach to introduce dimensions is use a container, e.g. `std::vector >`. similar to `t_voxel_collection`'s implementation, there may be more optimal ways to represent a dimension of collections, depending on how you want to balance memory, cpu, and so on. others have mentioned alternatives which consume less memory than mine (although mine will save a lot in the scenario), but lookup is very fast using my approach -- with a data set this large, the balance is important. – justin Mar 23 '12 at 04:08
  • Thanks justin. I would also like to get idea about lookup in following case. – user1286780 Mar 23 '12 at 08:20
  • I have a plane in a voxel (a), I want to find nearest voxels of (a) which my plane intersects (withount all 26 voxels). but this plane is not a vertical or horizontal if such case i hope that ray line method is better. Pls can you give any idea for this too? – user1286780 Mar 23 '12 at 08:28
  • @user1286780 have you tried Bresenham's line algorithm (3D)? http://stackoverflow.com/questions/5365019/ray-voxel-intersection – justin Mar 23 '12 at 09:00
  • It is only for a line detection but I have a plane (slant) so I want to take all the voxels beloging to this plane. – user1286780 Mar 23 '12 at 10:56
1

You're going to become unstuck whatever you do, even if you find a perfect memory layout for your sparse grid - that's still too much memory required. The real issue is being able to efficiently store it on disk and intelligently cache the regions of interest.

Thankfully Field3D was developed for just that.

cmannett85
  • 21,725
  • 8
  • 76
  • 119