
We have already seen that spanning trees and cuts are intimately related. Here is another connection. Let’s remove the last edge that Kruskal’s algorithm adds to the spanning tree; this breaks the tree into two components, thus defining a cut (S, S̄) in the graph. What can we say about this cut? Suppose the graph we were working with was unweighted, and that its edges were ordered uniformly at random for Kruskal’s algorithm to process them. Here is a remarkable fact: with probability at least 1/n^2, (S, S̄) is the minimum cut in the graph, where the size of a cut (S, S̄) is the number of edges crossing between S and S̄. This means that repeating the process O(n^2 log n) times and outputting the smallest cut found yields the minimum cut in G with high probability: an O(mn^2 log n) algorithm for unweighted minimum cuts. Some further tuning gives the O(n^2 log n) minimum cut algorithm, invented by David Karger, which is the fastest known algorithm for this important problem.

  • Doesn't this depend on the fact that there are n^2 unique ways for Kruskal's algorithm to process the graph? I mean, if there are only 3 unique ways for Kruskal's algorithm to process a graph with 10 nodes, repeating the process n^2 times will not produce n^2 unique "last edges". How would it work in a scenario where there are fewer than n^2 unique final cuts (that is, fewer than n^2 unique "last edges")?

  • What if there are fewer than n^2 edges in total? For example, you could have a connected graph of 10 nodes with only 9 edges, meaning that no matter how many times you repeat the algorithm, you won't have n^2 unique "last edges". How would it work in this situation?

  • Wouldn't it be easier to just loop through every edge and check whether that edge is the minimum cut? In a graph of n nodes, the maximum number of unique edges is (n-1) + (n-2) + ... + 1 = n(n-1)/2, which is less than n^2. And since n^2 is less than n^2 log n, why not just loop through all edges, since this would be faster?

templatetypedef
fdh

1 Answer


I think that you might be misinterpreting how the algorithm works. Each trial runs Kruskal's algorithm on a random edge order until just before the last edge would be added, then stops; the two components that remain define a candidate cut. The algorithm does not try to build up a collection of these "last edges"; instead, it runs O(n^2) random iterations of Kruskal's algorithm in order to build up O(n^2) candidate cuts. Taking the smallest cut among all of these candidates then gives the minimum cut with high probability. In other words, it doesn't matter if there are fewer than O(n^2) edges or fewer than O(n^2) distinct edge orderings. What matters is the cut that remains at the end of each trial, not the last edge that was considered.
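To make this concrete, here is a minimal Python sketch of one trial and the repeated version. The function names `random_kruskal_cut` and `min_cut` are my own, and this is the plain scheme from the quoted passage, not Karger's tuned refinement:

```python
import random

def random_kruskal_cut(n, edges):
    """One trial: run Kruskal's algorithm on a uniformly random edge
    order, stopping just before the final merge would join the last
    two components. Return the size of the cut those two sides define."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    order = edges[:]
    random.shuffle(order)
    components = n
    for u, v in order:
        if components == 2:
            break  # stop here: exactly two components remain
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    # The cut size is the number of edges crossing the two components.
    return sum(1 for u, v in edges if find(u) != find(v))

def min_cut(n, edges, trials=None):
    """Each trial finds the minimum cut with probability at least 1/n^2,
    so repeating ~n^2 times (or n^2 log n for high probability) and
    keeping the smallest candidate cut found succeeds w.h.p."""
    if trials is None:
        trials = n * n
    return min(random_kruskal_cut(n, edges) for _ in range(trials))
```

Notice that each trial stops when exactly two components remain, so a candidate cut is always well defined, even if the graph has far fewer than n^2 edges or far fewer than n^2 distinct edge orderings.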

Hope this helps!

templatetypedef
  • Thanks! What about my final question? If the minimum cut is guaranteed to be an edge, why not just loop through all edges and check each one? That would only take O(n^2) time, wouldn't it? – fdh Jul 06 '12 at 20:09
  • @Riddler- I think you are misinterpreting what a cut is. A cut is **not** a single edge in the graph. Rather, it's a set of edges that, when removed, will disconnect the graph and leave two unconnected regions. Finding the minimum cut with a naive approach would thus be "check all possible sets of edges to see if they're cuts, then return the smallest of them." This would take exponential time. Does that make sense? – templatetypedef Jul 06 '12 at 20:15
  • Yes, it does. One last thing: how long would it take with the naive approach (finding all cuts and returning the smallest one)? I know you said exponential, but what exactly? – fdh Jul 06 '12 at 22:51
  • There are 2^m sets of edges to check, and it takes O(m) time to generate and test each, so O(m·2^m) at least. – templatetypedef Jul 06 '12 at 22:59
  • @Riddler- It's the number of edges. By convention, m is the number of edges, and n is the number of nodes. – templatetypedef Jul 06 '12 at 23:56