The problem at hand:
given a set of N points in an D dimensional space, with all their coordinates >= 0 (in 2D the points would all be in the 1st quadrant, in 3D in the 1st octant, and so on...), remove all the points that have another point that has value bigger or equal in every coordinate.
In 2D, the result is this:
(image from Vincent Zoonekynd's answer here) and there is a simple algorithm, detailed in that answer, that runs in N*log(N)
.
With chunking I should have brought it to N*log(H)
, but optimizations on that are for another question.
I was interested in extending the solution to 3 dimensions (and possibly 4, if it's still reasonable), but my current 3D algorithm is pretty slow, cumbersome and doesn't generalize to 4D nicely:
- Sort points on the x axis, annotate the position of each point
- Initialize a sort of segment tree with N leaves, where leaves will hold the points' y values and a node will hold
max(child1, child2)
- Sort points on the z axis
- For every point from the largest z:
- Check what position it was in the x order, try to put it in the segment tree in that position
- Check first if there is a point already down (so it has > z), at an higher place (so it has > x) with a bigger y (this costs log(N), thanks tree)
- If said point is found, the current point is discarded, otherwise it's inserted and the tree is updated
This still runs in N*log(N)
, but requires 2 different sorts and a 2*N
-big structure.
Extending this would require another sort and a prohibitive 2*N^2
-big quad tree.
Are there more efficient (especially CPU-wise) approaches?
I don't think it's relevant, but I'm writing in C, the code is here.