2

Summary

So I have a directed acyclic weighted graph where each edge has an input and output node, each node has an ID and I use a dictionary to map all ingoing edges to one node by its ID and another to do the same for all outgoing edges so when presented a node ID I can tell all ingoing and outgoing edges in ~O(1) time.

Requirement

I need to be able to add new random edges (i.e. connect two random nodes) in a way where it is guaranteed that no matter how big the graph is it wont be having any cycles.

What I tried

Since I'm in full control of how to build my graph I could sort it topologically and use Kahn's Algorithm to figure if for two uniformly randomly chosen nodes N1 and N2 the graph will result in a cycle in O(n) time. The problem with that is what if it does? I'd have to try a new random pair and repeat the process until I get lucky. This sounds as if it will scale really badly with the graph since the more edge dense the graph is the more likely it is that a new random one will create a cycle.

I have also read this post: Generating a random DAG , which is similar in nature to my problem, however, I can't use the suggested solution to connect nodes based on their IDs and only connect smaller to larger IDs (nodes that came before with nodes that are new) due to other constraints that I have on the problem.

Question

Is there a way to design a structure that will only ever allow me to randomly pick between nodes none of which, if connected through a new edge, will yield a cycle irrelevant of the memory overhead? This should then be an O(1) operation.

Ilhan
  • 1,309
  • 10
  • 24
  • 1
    Should all valid edges have the same probability of being added? If not, you could first pick one node and pick the other node from the set of available valid nodes. But that is not uniformly at random. – Vincent van der Weele Jul 03 '19 at 10:24
  • All valid edges need to have the same probability of being added. I managed to visualize it easier by considering the graph G and its complement G' (i.e. everything that isn't connected) and then further removing all edges as possibilities which would yield a cycle if added. – Ilhan Jul 03 '19 at 10:41

1 Answers1

2

I have an O(1) solution for the check if an edge can be included in the graph. However it will take you worst-case O(n) to insert the edge.

You could maintain a binary reachability matrix R where R[u, v]=1 if you can reach v from u in your current graph and R[u, v]=0 if not. R can initially be computed once using Floyd-Warshall.

If you want to check if you can include an edge (u,v) you now just have to check if R[v, u] = 0. If it is R[v, u] = 1 you are constructing a circle by inserting (u,v).

The remaining problem becomes updating this structure. If you end up inserting the edge (u, v) into the graph you will set R[u, v] = 1. Additionally, all nodes that were able to reach u (R[:,u]=1) are are now able to reach all nodes reachable by v (R[v,:] = 1). Thus you will need to set your entries R[i, j] = 1 if R[i,u] = 1 and R[v:j] = 1.

Unfortunately, the update step will take O(n) in the worst case. In practice when the graph is still rather sparse you should be able to implement this efficiently with a sparse matrix representation (hash list with indicies v for each row u where R[u, v] = 1) and be much faster.

If you want to select a possible edge at random, you will have to additionally maintain and update a list of possible edges through the same structure.

SaiBot
  • 3,595
  • 1
  • 13
  • 19
  • Due to the nature of the question I'll wait a bit to see if other solutions arise before accepting. Everything I ever tried yielded similar (albeit slightly worse) results than your method. Maybe there is no way to do the entire thing in O(1) using some data structure or data mapping ? – Ilhan Jul 03 '19 at 11:21