3

Problem

I have a list of approximatly 200000 nodes that represent lat/lon position in a city and I have to compute the Minimum Spanning Tree. I know that I need to use Prim algorithm but first of all I need a connected graph. (We can assume that those nodes are in a Euclidian plan)

To build this connected graph I thought firstly to compute the complete graph but (205000*(205000-1)/2 is around 19 billions edges and I can't handle that.

Options

Then I came across to Delaunay triangulation: with the fact that if I build this "Delauney graph", it contains a sub graph that is the Minimum Spanning Tree according and I have a total of around 600000 edges according to Wikipedia [..]it has at most 3n-6 edges. So it may be a good starting point for a Minimum Spanning Tree algorithm.

Another options is to build an approximately connected graph but with that I will maybe miss important edges that will influence my Minimum Spanning Tree.

My question

Is Delaunay a reliable solution in this case? If so, is there any other reliable solution than delaunay triangulation to this problem ?

Further information: this problem has to be solved in C.

Update

This was done sucessfully by doing Delaunay triangulation, you can see result here : https://github.com/Siirko/ACM-Paris

Note : the code is quite ugly sometimes so be warned.

Siirko
  • 33
  • 4
  • Are edge costs just euclidean distance? – Edward Peters Nov 22 '22 at 21:56
  • @EdwardPeters We can assume that yes. – Siirko Nov 22 '22 at 21:58
  • Does this answer your question? [What is the simplest, easiest algorithm for finding EMST of a complete graph of order 10^5](https://stackoverflow.com/questions/55192325/what-is-the-simplest-easiest-algorithm-for-finding-emst-of-a-complete-graph-of) – Raymond Chen Nov 22 '22 at 22:05
  • 1
    As a rule, while wikipedia is not perfectly accurate, it is generally more reliable than StackOverflow answers - so if Wikipedia says it you can probably just trust that rather than ask us. I don't think that's easy to implement, though. If you have a library that will do it for you, great. – Edward Peters Nov 22 '22 at 22:05
  • Please, Delaun**a**y. –  Nov 22 '22 at 22:14
  • @YvesDaoust are you saying please use Delaunay, or please spell it correctly? :) – Edward Peters Nov 22 '22 at 22:17
  • @EdwardPeters: the typography leaves no doubt. –  Nov 22 '22 at 22:19
  • @ravenspoint bad english, only one "somewhat" should've been used – Siirko Nov 22 '22 at 22:25
  • @ravenspoint build a connected graph that won't have all edges compared to a complete graph but will probably miss important edges that will impact my MST result – Siirko Nov 22 '22 at 22:35
  • For a C/C++ implementation of the Delaunay triangulation, you may want to look at the CGAL library https://doc.cgal.org/latest/Triangulation_2/index.html or look for Shewchuk's Triangle library https://www.cs.cmu.edu/~quake/triangle.html – Gary Lucas Nov 23 '22 at 14:32

2 Answers2

3

The Delaunay triangulation of a point set is always a superset of the EMST of these points. So it is absolutely "reliable"*. And recommended, as it has a size linear in the number of points and can be efficiently built.

*When there are cocircular point quadruples, neither the triangulation nor the EMST are uniquely defined, but this is usually harmless.

  • Is it commonly available in libraries, and if not, how easy is it to implement? I started reading the wikipedia article and got scared. – Edward Peters Nov 22 '22 at 22:21
  • @EdwardPeters: my favourite version is that of Guibas & Stolfi, which is bulletproof. Unfortunately their article is focused on a very powerful data structure which is terrible to swallow. In practice, a lighter version, the quad-edge, is quite sufficient. Some ready-made implementation should exist. –  Nov 22 '22 at 22:26
2

There's a big question here of what libraries you have access to and how much you trust yourself as a coder. (I'm assuming the fact that you're new on SO should not be taken as a measure of your overall experience as a programmer - if it is, well, RIP.)

If we assume you don't have access to Delaunay and can't implement it yourself, minimum spanning trees algorithms that pre-suppose a graph aren't necessarily off limits to you. You can have the complete graph conceptually but not actually. Kruskal's algorithm, for instance, assumes you have a sorted list of all edges in your graph; most of your edges will not be near the minimum, and you do not have to compare all n^2 to find the minimum.

You can find minimum edges quickly by estimations that give you a reduced set, then refinement. For instance, if you divide your graph into a 100*100 grid, for any point p in the graph, points in the same grid square as p are guaranteed to be closer than points three or more squares away. This gives a much smaller set of points you have to compare to safely know you've found the closest.

It still won't be easy, but that might be easier than Delaunay.

Edward Peters
  • 3,623
  • 2
  • 16
  • 39