11

I want to be able to generate random, undirected, and connected graphs in Java. In addition, I want to be able to control the maximum number of vertices in the graph. I am not sure what would be the best way to approach this problem, but here are a few I can think of:

(1) Generate a number between 0 and n and let that be the number of vertices. Then, somehow randomly link vertices together (maybe generate a random number per vertex and let that be the number of edges coming out of said vertex). Traverse the graph starting from an arbitrary vertex (say with Breadth-First-Search) and let our random graph G be all the visited nodes (this way, we make sure that G is connected).

(2) Generate a random square matrix (of 0's and 1's) with side length between 0 and n (somehow). This would be the adjacency matrix for our graph (the diagonal of the matrix should then either be all 1's or all 0's). Make a data structure from the graph and traverse the graph from any node to get a connected list of nodes and call that the graph G.

Any other way to generate a sufficiently random graph is welcomed. Note: I do not need a purely random graph, i.e., the graph you generate doesn't have to have any special mathematical properties (like uniformity of some sort). I simply need lots and lots of graphs for testing purposes of something else.

Here is the Java Node class I am using:

public class Node<T> {
    T data;
    ArrayList<Node> children= new ArrayList<Node>();
    ...}

Here is the Graph class I am using (you can tell why I am only interested in connected graphs at the moment):

public class Graph {
    Node mainNode;
    ArrayList<Node> V= new ArrayList<Node>();

    public Graph(Node node){
        mainNode= node;
    }
    ...}

As an example, this is how I make graphs for testing purposes right now:

//The following makes a "kite" graph G (with "a" as the main node).

/*     a-b
        |/|
        c-d
*/
Node<String> a= new Node("a");
Node<String> b= new Node("b");
Node<String> c= new Node("c");
Node<String> d= new Node("d");
a.addChild(b);
a.addChild(c);
b.addChild(a);
b.addChild(c);
b.addChild(d);
c.addChild(a);
c.addChild(b);
c.addChild(d);
d.addChild(c);
d.addChild(b);
Graph G1= new Graph(a);
Tomer Shetah
  • 8,413
  • 7
  • 27
  • 35
under_the_sea_salad
  • 1,754
  • 3
  • 22
  • 42
  • For this you can use a random data generation library such as Quickcheck for Java. However, such libraries don't usually have a built-in method for generating graphs, so this could be a little tricky. Try this out and reply if you have problems. – Robin Green Nov 24 '13 at 10:24
  • I would combine the two approaches. Initially create a simple connected graphs by connecting a random node not in the graph to a random node in it. Then from the unpicked possible vertices (`0`s in the matrix), pick a number to add to the graph to make it denser. – millimoose Nov 24 '13 at 20:12
  • @RobinGreen, although it looks like Quickcheck would help with generating primitives, I would still have to generate the `Node`s that contain those primitives (which is the much harder part). I was looking for a more explicit construction, too, without using a library. – under_the_sea_salad Nov 24 '13 at 23:28
  • 1
    @bourbaki4481472 [unrelated to this question] I see that you have deleted your [recent question](http://stackoverflow.com/questions/34561948/how-to-get-text-that-surrounds-and-includes-the-span-tags-or-any-tags-with-a). Just to let you know, here's a solution that you might be interested in https://jsfiddle.net/DerekL/Lh3fy5dr/ – Derek 朕會功夫 Jan 02 '16 at 03:18
  • @Derek朕會功夫, I deleted my question because I thought I didn't explain well enough (and I happened to have figured out how to do it since). I am looking at your solutions right now too, thanks! – under_the_sea_salad Jan 02 '16 at 03:30
  • 1
    @bourbaki4481472 I just updated the code (https://jsfiddle.net/DerekL/Lh3fy5dr/1/) to a much cleaner one... hope it helps. – Derek 朕會功夫 Jan 02 '16 at 03:41

3 Answers3

18

Whatever you want to do with your graph, I guess its density is also an important parameter. Otherwise, you'd just generate a set of small cliques (complete graphs) using random sizes, and then connect them randomly.

If I'm correct, I'd advise you to use the Erdős-Rényi model: it's simple, not far from what you originally proposed, and allows you to control the graph density (so, basically: the number of links).

Here's a short description of this model:

  1. Define a probability value p (the higher p and the denser the graph: 0=no link, 1=fully connected graph);
  2. Create your n nodes (as objects, as an adjacency matrix, or anything that suits you);
  3. Each pair of nodes is connected with a (independent) probability p. So, you have to decide of the existence of a link between them using this probability p. For example, I guess you could ranbdomly draw a value q between 0 and 1 and create the link iff q < p. Then do the same thing for each possible pair of nodes in the graph.

With this model, if your p is large enough, then it's highly probable your graph is connected (cf. the Wikipedia reference for details). In any case, if you have several components, you can also force its connectedness by creating links between nodes of distinct components. First, you have to identify each component by performing breadth-first searches (one for each component). Then, you select pairs of nodes in two distinct components, create a link between them and consider both components as merged. You repeat this process until you've got a single component remaining.

Vincent Labatut
  • 1,788
  • 1
  • 25
  • 38
  • Thank you for your explanation and link. This model might suit me. – under_the_sea_salad Nov 24 '13 at 23:22
  • Although I do have to say that I am a bit confused on the differences between the two models that the Wikipedia article talks about. – under_the_sea_salad Nov 24 '13 at 23:54
  • 2
    In the first model, you consider all possible graphs respecting some parameters (numbers of nodes and links) and you randomly pick one. In the second one, you build a graph by randomly adding links to an initially empty graph. I found the second model more suitable for implementation, so this is the one I presented in my post. – Vincent Labatut Nov 25 '13 at 12:02
3

The only tricky part is ensuring that the final graph is connected. To do that, you can use a disjoint set data structure. Keep track of the number of components, initially n. Repeatedly pick pairs of random vertices u and v, adding the edge (u, v) to the graph and to the disjoint set structure, and decrementing the component count when the that structure tells you u and v belonged to different components. Stop when the component count reaches 1. (Note that using an adjacency matrix simplifies managing the case where the edge (u, v) is already present in the graph: in this case, adj[u][v] will be set to 1 a second time, which as desired has no effect.)

If you find this creates graphs that are too dense (or too sparse), then you can use another random number to add edges only k% of the time when the endpoints are already part of the same component (or when they are part of different components), for some k.

j_random_hacker
  • 50,331
  • 10
  • 105
  • 169
  • Thank you. Good point on setting adj[u][v] to 1 in the adjacency matrix (once two disjoint sets have been identified) as this would help managing this problem. – under_the_sea_salad Nov 24 '13 at 23:24
1

The following paper proposes an algorithm that uniformly samples connected random graphs with prescribed degree sequence, with an efficient implementation. It is available in several libraries, like Networkit or igraph.

Fast generation of random connected graphs with prescribed degrees. Fabien Viger, Matthieu Latapy

Be careful when you make simulations on random graphs: if they are not sampled uniformly, then they may have hidden properties that impact simulations; alternatively, uniformly sampled graphs may be very different from the ones your code will meet in practice...

  • Hello and welcome to SO! Please read the [tour](https://stackoverflow.com/tour), and [How do I write a good answer?](https://stackoverflow.com/help/how-to-answer) Adding links is great, but explaining how they solve the question is much better. – Tomer Shetah Dec 22 '20 at 09:57
  • Thanks @TomerShetah for your comment. My link is to a paper solving precisely the question, in the context I explain in the beginning of my answer (uniform generation, prescribed degrees). Do you mean I should summarize the algorithm here? Would'nt it be redundant and overly complex? – Matthieu Latapy Dec 22 '20 at 10:43
  • You should not refer to other sites to get the whole answer. Take a look at the accepted answer for example. – Tomer Shetah Dec 22 '20 at 10:45