2

Suppose I have a point cloud given in 4D space, which can be set up arbitrarily dense. These points will lie on the surface of a polytope, which is an unknown object at this point. (Just to provide some unhelpful visualization, here is a rather arbitrary projection of the point cloud which I have given - i.e. plotting the first 3 columns of the 100000x4 array):

Point cloud

My goal is to obtain the vertices of this polytope, and one of my numerous attempts to get those was computing the convex hull of the point cloud. The problem here was that the number of resulting vertices differed with the specified tolerance and there was no a priori way of checking which one to use. I also noted that the use of the qhull algorithm is not optimal here, as all given points will actually lie on the hull. Now in my former question the answer I accepted suggested setting up a face loss function and using gradient descent. This seems like a nice approach, but I haven't been able to implement it at the time. I assume this would be a linear programming question, which I unfortunately know nothing about. I scanned a few online resources and it seems that finding such extreme points is something that is regularly encountered in that setting (is that what convex optimization is about?), which is why I hope it makes sense to ask a follow up question on SO.

I also hope it doesn't violate etiquette if I paste the suggested approach here:

Let xi in R4, i = 1, ..., m, be the PCA-reduced data points.

Let F = (a, b) be a face, where a is in R4 with a • a = 1 and b is in R.

We define the face loss function L as follows, where λ+, λ- > 0 are parameters you choose. λ+ should be a very small positive number. λ- should be a very large positive number.

L(F) = sumi(λ+ • max(0, a • xi + b) - λ- • min(0, a • xi + b))

We want to find minimal faces F with respect to the loss function L. The small set of all minimal faces will describe your polytope. You can solve for minimal faces by randomly initializing F and then performing gradient descent using the partial derivatives ∂L / ∂aj, j = 1, 2, 3, 4, and ∂L / ∂b. At each step of gradient descent, constrain a • a to be 1 by normalizing.

∂L / ∂aj = sumi(λ+ • xj • [a • xi + b > 0] - λ- • xj • [a • xi + b < 0]) for j = 1, 2, 3, 4

∂L / ∂b = sumi(λ+ • [a • xi + b > 0] - λ- • [a • xi + b < 0])

Note Iverson brackets: [P] = 1 if P is true and [P] = 0 if P is false.

Can someone provide a few pointers how I should approach putting this into code? I suppose I can use Matlab's linprog for that? For instance, I am not quite certain what it means to randomly initiate a face, and if a and b are unknowns at this point.

Also, here is a link to one of the datasets.

Community
  • 1
  • 1
Denor
  • 171
  • 1
  • 8
  • Why is the qhull approach not satisfactory? What do you mean by the tolerance? – knedlsepp Mar 21 '15 at 18:34
  • As far as I understood your approach you would do the following: Mesh the convex hull with 4D simplices, find the 3D-boundary-tetrahedra, find out the normal vector of the 3D space they are lying in, and then compare adjacent tetrahedra according to this vector. Is this correct? – knedlsepp Mar 21 '15 at 18:39
  • Ok, after re-reading it is clear to me what you mean by the tolerance. But that is something you will have to specify before. Imagine the same problem in 3D. You would also have to specify this tolerance on when a node will be considered a new corner or otherwise be considered as 'in-plane' with the surrounding nodes. – knedlsepp Mar 21 '15 at 18:50
  • Could you upload such a dataset? – knedlsepp Mar 21 '15 at 20:29
  • Yes, I am fairly certain this is where the qhull approach fails, and I am aware that I would have to specify these values. But I do not see how I can obtain them in theory, and I suppose this is something the algorithm is not set to solve. I could certainly provide the array via some link, as I would think it's a bit to large to print here. – Denor Mar 21 '15 at 20:34
  • Just throwing random ideas: You could try to use a RANSAC approach to fit planes to your data. Save each plane and remove the fitting datapoints. Do this until there are no more points left. Afterwards you can compute your corner points via some intersection method of the planes. – knedlsepp Mar 21 '15 at 20:43
  • RANSAC was also mentioned in my first question, but it wasn't clear to me how I can use it in higher dimensions. My latest attempt consisted of trying to find the planes by hand, only by dot product tests / coplanarity calculations, but I haven't even gotten as far as obtaining one single plane correctly. It does distinguish between points but the results simply do not look like 2D planes at all. They don't even look like hyperplanes, but then again, the visual sanity check clearly has its limitations here. – Denor Mar 21 '15 at 21:01
  • By the way, I have added a link to my question. – Denor Mar 21 '15 at 21:03
  • And as your final result you really only want the vertices? No connectivity information, no face information? – knedlsepp Mar 21 '15 at 21:47
  • Yes, I really only need the list of vertices. – Denor Mar 21 '15 at 21:49
  • Hm. I tried the RANSAC approach and finding the planes in which the faces lie is not that difficult. The hard part is figuring out which faces are connected in the original polytope to intersect the correct planes. This looks like quite some manual effort. – knedlsepp Mar 23 '15 at 14:42

1 Answers1

3

The following uses a RANSAC-approach to fit 2D planes onto a nD point cloud. Afterwards the vertices can be found via intersection of those planes.

Algorithmic idea:

The general idea of RANSAC is really simple.

  1. Select random data points called hypothetical inliers
  2. Generate model from hypothetical inliers
  3. Find other points that fit the model well, call them consensus set
  4. Repeat steps 1-3 until having found a model with consensus set of desired magnitude.
  5. Improve the model based on the maximum consensus set.

In our example, we can use it in the following way:

  1. Select three hypothetical inliers.
  2. The plane spanned by those three points is our model.
  3. Compute orthogonal distance of the entire datasets' points to the plane. Select those points within a predefined tolerance.
  4. Repeat steps 1-3 and select the plane with the largest consensus set.
  5. (We could find a least squares fit for the plane now, but in this case that doesn't seem necessary.)
  6. Save the found plane and remove its consensus set from the points we need to search.
  7. Repeat above steps until all planes are found.
  8. Intersect the planes to get the vertices.

Code needed to implement this:

We need a bit of code to achieve this, which is beyond the scope of this answer:

Implementation:

With this the RANSAC part is pretty simple:

%% // Match planes to dataset X:
%  // Choose 3 Points randomly. Generate plane. Find points within tol.
pointsWithinTolOf = @(Points,tol,Space) ...
                      distancePointsAffineSpace(Points, Space)<tol;
availablePoints = 1:size(X,1);
[foundPlane, pointsOfPlane] = deal(cell(0));
for i = 1:maxNumPlanes
    disp(['Plane #',num2str(i)]);
    bestNumInliers = 0;
    for j = 1:numIterations
        randomPointsIdxs = availablePoints(randperm(numel(availablePoints),3));
        currentPlane = X(randomPointsIdxs,:);
        inliers = find(pointsWithinTolOf(X, PointPlaneDistTol, currentPlane));
        numInliers = numel(inliers);
        if numInliers > bestNumInliers
            bestCurrentPlane = currentPlane;
            bestInliers = inliers;
            bestNumInliers = numInliers;
        end
    end
    foundPlane{i} = bestCurrentPlane;
    pointsOfPlane{i} = bestInliers;
    availablePoints = setdiff(availablePoints, bestInliers);
    if isempty(availablePoints)
        break;
    end
end
numPlanes = numel(foundPlane);

The rest of the code is rather long, so I will just write down the basic idea and link to the code.

Once we have found our model planes we:

  • Find the edges of each plane by intersecting all model planes and checking if there are data points on the intersection line.
  • Compute the vertices of each plane by intersecting its edges.

You can download the entire code for this here.

Visualization:

It seems you are dealing with the boundary of a 3D Möbius strip made from three extruded triangles and three extruded quadrangles. Here is a 4D rotation of this strip projected into 3D (projected on your 2D screen.)

Möbius strip

knedlsepp
  • 6,065
  • 3
  • 20
  • 41
  • Very nice! That visualization alone would have been more than enough to accept this. – Denor Mar 24 '15 at 14:43
  • @Denor: It was really interesting solving this problem. There are some magic constants (the tolerances and number of iterations.) but otherwise I hope it should work on more than this single dataset. (You wouldn't even need to use the 4D-PCA version, but could use your original 6D cloud instead I guess. Maybe you'd need to adjust the tolerances for other sets though.) Just because I'm curious: Where does this data stem from and what do you need the vertices for? – knedlsepp Mar 24 '15 at 15:00
  • Well, I must admit you made more progress on this than me with much more time on my hands (then again, I am not a programmer). Using it on the original dataset is a great option as well. This point cloud is the output of a computational geometry program I wrote and it turns out these points give the orbit of a dynamical system in the moduli space of some generalized class of polygons! So, it is of interest in an area of mathematical research. In the end, I will want to use the vertices to compare different input polygons, but actually any means of classification could work likewise. – Denor Mar 24 '15 at 15:11
  • @Denor: Oh, algebraic geometry! Good luck with your research then! – knedlsepp Mar 24 '15 at 15:32