22

I'm trying to find an efficient algorithm to generate a simple connected graph with given sparseness. Something like:

Input:
    N - size of generated graph
    S - sparseness (numer of edges actually; from N-1 to N(N-1)/2)
Output:
    simple connected graph G(v,e) with N vertices and S edges
F0RR
  • 1,590
  • 4
  • 16
  • 30
  • This paper claims to provide an efficient algorithm for generating random connected graphs. I haven't gone through the details but they claim better performance, especially for large networks. http://complexnetworks.fr/wp-content/uploads/2011/01/random.pdf – Harsh Nov 08 '17 at 19:48

4 Answers4

47

High-Level Idea

  1. Generate a (uniformly chosen) random spanning tree with N nodes and N - 1 edges.
  2. Until the requested number of edges has been reached, add an edge between any two random nodes.

Creating the Spanning Tree

The partition-based answer by ypnos is a good start, but bias is introduced by always selecting a visited node for one end of a new edge. By randomly selecting a visited node at each iteration, nodes that are visited towards the beginning have more iterations from which they have a chance to be chosen. Therefore, earlier nodes are more likely to have a high degree (number of edges) than those picked later.

Example of Bias

As an example, for a 4 node connected graph rather than generating a linear path graph, which is what 75% of the possible spanning trees are, this kind of bias will cause the star graph to be generated with greater than the 25% probability that it should be.

the possible spanning trees for a graph of size 2, 3, and 4 nodes

Bias isn't always a bad thing. It turns out this kind of bias is good for generating spanning trees that are similar to real world computer networks. However, in order to create a truly random connected graph the initial spanning tree must be picked uniformly from the set of possible spanning trees (see Wikipedia's Uniform Spanning Tree article).

Random Walk Approach

One approach to generating a uniform spanning tree is through a random walk. Below is a quote from the paper Generating Random Spanning Trees More Quickly than the Cover Time by Wilson describing simple random walk algorithm.

Start at any vertex and do a simple random walk on the graph. Each time a vertex is first encountered, mark the edge from which it was discovered. When all the vertices are discovered, the marked edges form a random spanning tree. This algorithm is easy to code up, has small running time constants, and has a nice proof that it generates trees with the right probabilities.

This works well for a simple connected graph, however if you need an algorithm for a directed graph then read the paper further as it describes Wilson's Algorithm. Here is another resource for random spanning trees and Wilson's Algorithm.

Implementation

As I was also interested in this problem, I coded Python implementations of various approaches, including the random walk approach. Feel free to take a look at the Gist of the code on GitHub.

Below is an excerpt from the code of the random walk approach:

# Create two partitions, S and T. Initially store all nodes in S.
S, T = set(nodes), set()

# Pick a random node, and mark it as visited and the current node.
current_node = random.sample(S, 1).pop()
S.remove(current_node)
T.add(current_node)

graph = Graph(nodes)

# Create a random connected graph.
while S:
    # Randomly pick the next node from the neighbors of the current node.
    # As we are generating a connected graph, we assume a complete graph.
    neighbor_node = random.sample(nodes, 1).pop()
    # If the new node hasn't been visited, add the edge from current to new.
    if neighbor_node not in T:
        edge = (current_node, neighbor_node)
        graph.add_edge(edge)
        S.remove(neighbor_node)
        T.add(neighbor_node)
    # Set the new node as the current node.
    current_node = neighbor_node

# Add random edges until the number of desired edges is reached.
graph.add_random_edges(num_edges)
Community
  • 1
  • 1
Wesley Baugh
  • 3,720
  • 4
  • 24
  • 42
  • 2
    What is the complexity of the random walk approach? Doesn't the "pick a node randomly; ignore it if it was already used" pattern lead to potentially unbound running time? – Jules Olléon Feb 11 '14 at 20:54
  • 1
    @JulesOlléon As the paper by Wilson discusses, the [Andrei Broder algorithm](https://www.cs.cmu.edu/~15859n/RelatedWork/Broder-GenRanSpanningTrees.pdf) runs within the cover time of an undirected graph, which is bounded by `O(n^3)` for the worst graphs, but is as small as `O(n lg n)` for almost all graphs---where `n` is the number of vertices. – Wesley Baugh Mar 20 '14 at 00:39
  • What is the use of graph.add_random_edges(num_edges)? Because in the edge is already added in graph.add_edge(edge). – want_to_be_calm Jan 24 '18 at 04:15
  • @want_to_be_calm: The earlier code handles part-1 of the high-level idea by creating a graph with **N-1** edges. The second part of the original question asks how to create a graph of a requested sparseness i.e., a specific number of edges. If the ` |E| < (|N| - 1)` then additional edges will need to be added (point-2 of the high-level idea). The value of `num_edges` in the example code is ambiguous on its own. It depends on the implementation of `add_random_edged()` as to whether or not the argument specifies the total number of edges requested or the number of new edges that should be added. – Wesley Baugh Jan 25 '18 at 18:32
23

For each node you need at least one edge.

Start with one node. In each iteration, create a new node and a new edge. The edge is to connect the new node with a random node from the previous node set.

After all nodes are created, create random edges until S is fulfilled. Make sure not to create double edges (for this you can use an adjacency matrix).

Random graph is done in O(S).

Mermoz
  • 14,898
  • 17
  • 60
  • 85
ypnos
  • 50,202
  • 14
  • 95
  • 141
  • Just made something like this in Java. In addition, it's better to add random nodes while creating minimum connected graph. Otherwise, you'll always get the only vertical covering last node, no more than two verticals covering previous node and so on. Alternatively, you can just get isomorphic graph of generated one by matrix of adjacency manipulations. – Eugene Mar 10 '11 at 03:22
  • 1
    This is good and simple, but it is not O(S) if the graph is dense, because of the double edges check. I mean, the worst case is (almost) never O(S), but if the graph is dense even the expected time complexity if not O(S). – Gabor Csardi Feb 20 '14 at 14:54
  • 4
    Sadly this graph is not chosen uniformly at random in the set of all graphs with n nodes and m edges. It will generate a certain type of graph. Since some nodes are present in the graph earlier, these nodes will tend to have higher degrees. – cglacet Feb 22 '19 at 15:37
4

Based on Wesley Baugh's answer I came up with the following javascript implementation with cytoscape.js to handle graphs:

function generateRandomGraph(cy, numNode, avgDegree, weightMin, weightMax) {
  // create nodes
  for (var i = 0; i < numNode; i++) {
    cy.add({
      group: "nodes",
      data: {
        id: "n" + i
      }
    });
  }

  // perform random walks to connect edges
  var nodes = cy.nodes(),
    S = nodes.toArray(),
    T = []; // visited

  var currNodeIdx = randomIntBetween(0, S.length);
  var currNode = S[currNodeIdx];
  S.splice(currNodeIdx, 1);
  T.push(currNode);

  while (S.length > 0) {
    var neighbourNodeIdx = randomIntBetween(0, S.length);
    var neighbourNode = S[neighbourNodeIdx];
    cy.add({
      group: "edges",
      data: {
        weight: randomIntBetweenInclusive(weightMin, weightMax),
        source: currNode.id(),
        target: neighbourNode.id()
      }
    });
    S.splice(neighbourNodeIdx, 1);
    T.push(neighbourNode);
    currNode = neighbourNode;
  }

  // add random edges until avgDegree is satisfied
  while (nodes.totalDegree() / nodes.length < avgDegree) {
    var temp = sampleInPlace(nodes, 2);
    if (temp[0].edgesWith(temp[1]).length === 0) {
      cy.add({
        group: "edges",
        data: {
          weight: randomIntBetweenInclusive(weightMin, weightMax),
          source: temp[0].id(),
          target: temp[1].id()
        }
      })
    }
  }
}

generateRandomGraph(cy, 20, 2.8, 1, 20);

For complete example source code, visit my github repo :)

Community
  • 1
  • 1
Guan Gui
  • 595
  • 5
  • 8
1

Generate a minimum spanning tree using something like Prim's algorithm, and from there randomly generate additional links to the according to the sparseness you want.

dwjohnston
  • 11,163
  • 32
  • 99
  • 194