18

I've got a collection of 10000 - 100000 spheres, and I need to find the ones farthest apart.

One simple way to do this is to simply compare all the spheres to each other and store the biggest distance, but this feels like a real resource hog of an algorithm.

The Spheres are stored in the following way:

Sphere (float x, float y, float z, float radius);

The method Sphere::distanceTo(Sphere &s) returns the distance between the two center points of the spheres.

Example:

Sphere *spheres;
float biggestDistance;

for (int i = 0; i < nOfSpheres; i++) {
    for (int j = 0; j < nOfSpheres; j++) {
        if (spheres[i].distanceTo(spheres[j]) > biggestDistance) {
            biggestDistance = spheres[i].distanceTo(spheres[j]) > biggestDistance;
        }
    }
}

What I'm looking for is an algorithm that somehow loops through all the possible combinations in a smarter way, if there is any.

The project is written in C++ (which it has to be), so any solutions that only work in languages other than C/C++ are of less interest.

user1118321
  • 25,567
  • 4
  • 55
  • 86
illuzive
  • 349
  • 2
  • 11
  • Just to clarify, you want something that is better than O(n^2) (which is bad). Hmmm..... – Craig Feb 16 '10 at 21:34
  • 2
    Does distanceTo() obey the triangle inequality? is distanceTo() center to center or surface to surface? Do the spheres have coordinates in a vector space? Is it in a euclidean space or are there black holes around curving the metric? Where did you get all those spheres, anyway? Is there anything interesting inside the spheres that could be sold on eBay, because you seem to have a lot of them? – Paul Feb 16 '10 at 21:35
  • @Craig If they were on the real line it would be n log(n) to sort but only O(n) to find the min and max coordinate. That assumes that radius or surface is not an issue – Paul Feb 16 '10 at 21:36
  • A BFS search provides O(b^d) complexity or O(|E| + |V|) – Vivin Paliath Feb 16 '10 at 21:42
  • @Mark: Yes, it's homework. How did you guess that? (Besides the fact that CS teachers loooove big collections of data that gets out of hand.) @Craig: Yes, I know there are a few improvements that can be made - but I've got a feeling that I'm missing something. =P @Paul: Lol, I think the inside of the spheres are full of void. If you want to buy some of it I'm sure we can arrange something. ;-) – illuzive Feb 16 '10 at 21:44
  • @illuzive: It looks a lot like homework. It's OK to ask for help with programming related homework here, but a StackOverflow best-practice is to tag homework questions with the `homework` tag. – Mark Byers Feb 16 '10 at 21:47
  • @Mark: Ahh,I see. I'm new here so I didn't know that. I'll make sure to tag it correctly in the future. Thanks for the info. =) – illuzive Feb 16 '10 at 22:03
  • This would be a WAY better homework question if Sphere::distanceTo(Sphere &s) returned the min distance between points on the two spheres rather than the distance between the centers. Why have spheres at all? – IV. Feb 17 '10 at 00:15
  • This is only a small part of a bigger assignment, where the spheres are used in different ways. – illuzive Feb 17 '10 at 00:29

4 Answers4

33

The largest distance between any two points in a set S of points is called the diameter. Finding the diameter of a set of points is a well-known problem in computational geometry. In general, there are two steps here:

  1. Find the three-dimensional convex hull composed of the center of each sphere -- say, using the quickhull implementation in CGAL.

  2. Find the points on the hull that are farthest apart. (Two points on the interior of the hull cannot be part of the diameter, or otherwise they would be on the hull, which is a contradiction.)

With quickhull, you can do the first step in O(n log n) in the average case and O(n2) worst-case running time. (In practice, quickhull significantly outperforms all other known algorithms.) It is possible to guarantee a better worst-case bound if you can guarantee certain properties about the ordering of the spheres, but that is a different topic.

The second step can be done in Ω(h log h), where h is the number of points on the hull. In the worst case, h = n (every point is on the hull), but that's pretty unlikely if you have thousands of random spheres. In general, h will be much smaller than n. Here's an overview of this method.

John Feminella
  • 303,634
  • 46
  • 339
  • 357
  • 7
    +1: I have deleted my post after seeing this one as this one is a superset and better answer. –  Feb 16 '10 at 21:48
  • 1
    I would have been more descriptive about the steps, but this is homework. However, CGAL provides source code, so looking at that is a pretty good start to understanding what's happening here. – John Feminella Feb 16 '10 at 21:58
  • 2
    After some brief Googling, I thought I would add this link as well: http://www.personal.kent.edu/~rmuhamma/Compgeometry/MyCG/ConvexHull/quickHull.htm – Michael Mior Feb 17 '10 at 15:45
  • @JohnFeminella - that overview link appears to be dead. I'm working on a similar problem which led me here and I clicked that link. :-( – ppetree Sep 21 '18 at 20:46
3

Could you perhaps store these spheres in a BSP Tree? If that's acceptable, then you could start by looking for nodes of the tree containing spheres which are furthest apart. Then you can continue down the tree until you get to individual spheres.

Michael Mior
  • 28,107
  • 9
  • 89
  • 113
  • I've thought about a tree structure too, but I haven't looked into the BSP Tree yet, thanks for your reply. =) – illuzive Feb 16 '10 at 21:46
2

Your problem looks like something that could be solved using graphs. Since the distance from Sphere A to Sphere B is the same as the distance from Sphere B to Sphere A, you can minimize the number of comparisons you have to make.

I think what you're looking at here is known as an Adjacency List. You can either build one up, and then traverse that to find the longest distance.

Another approach you can use will still give you an O(n^2) but you can minimize the number of comparisons you have to make. You can store the result of your calculation into a hash table where the key is the name of the edge (so AB would hold the length from A to B). Before you perform your distance calculation, check to see if AB or BA exists in the hash table.

EDIT

Using the adjacency-list method (which is basically a Breadth-First Search) you get O(b^d) or worst-case O(|E| + |V|) complexity.

Vivin Paliath
  • 94,126
  • 40
  • 223
  • 295
2

Paul got my brain thinking and you can optimize a bit by changing

for (int j=0; j < nOfSpheres; j++) 

to

for (int j=i+1;  j < nOfSpheres; j++) 

You don't need to compare sphere A to B AND B to A. This will make the search O(n log n).

--- Addition -------

Another thing that makes this calculation expensive is the DistanceTo calulations.

distance = sqrt((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2)

That's alot of math. You can trim that down by checking to see if

((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 > maxdist^2

Removes the sqrt until the end.

Craig
  • 11,614
  • 13
  • 44
  • 62