4

This question has a great answer for detecting cycles in a directed graph. Unfortunately, it does not seem easy to make a Map Reduce version of it.

Specifically, I am interested in a Map Reduce algorithm for removing cycles from a directed graph.

I have evaluated using a breadth first search (BFS) algorithm but an issue I see is that two different edges may be removed simultaneously to cut off a cycle. The impact of this scenario is that too many edges could be removed. It is important that cycles are removed while minimizing the number of edges removed.

Solutions with proofs available are preferred!

Thanks.

Community
  • 1
  • 1
Marcus Ericsson
  • 1,949
  • 1
  • 12
  • 13

2 Answers2

1

You need an iterative map reduce to implement this algorithm. See http://www.iterativemapreduce.org/ for a map-reduce framework that centers around iterative map reduces. Or http://www.johnandcailin.com/blog/cailin/breadth-first-graph-search-using-iterative-map-reduce-algorithm for a worked example of how to do a breadth-first search through a graph with Hadoop using an iterative map reduce.

btilly
  • 43,296
  • 3
  • 59
  • 88
  • Thanks for your response. Are you certain the BFS algorithm won't remove too many edges since vertices will be evaluated simultaneously in different map reduce nodes? – Marcus Ericsson May 20 '11 at 21:19
  • @Marcus Ericsson: In the map phase emit a key which represents the whole cycle, and in the reduce phase you can collapse cycles. Thus you can collapse all cycles of one length at once. However as you progress you need to be careful to figure out which edges you have removed, and clear out any paths from your working set that just got removed in your last phase. (That or just restart if you break any cycles.) – btilly May 20 '11 at 22:06
1

Well if you want to remove all cycles, then you will end up with a tree. So no matter what algorithm you use, you will remove |E| - (n -1) edges. (if it was correct of course)

However, the question is whether the deletion of edges will lead to a disconnected graph. For this you will need to make an ordering of the edges (let's say lexicographic order). You should then always remove the the largest edge in a cycle. [I guess the proof of correctness is very direct whence: simply use Kruskal algorithm and find that they will be the same ! ]

Any spanning tree algorithm would solve the problem for you. Depending on what you want to optimize (either time or messsage complexity or any other perfomance metric), you will find different algorithms. BFS is the best for time. No algorithm can solve the problem for less than c(logn + m) message for c > 0.

There is an algoritm I like using for DAG's is called YO-YO. The description of the algorithm can be found in : http://www.site.uottawa.ca/~flocchin/CSI4509/8-yoyo11_fr.pdf

AJed
  • 578
  • 6
  • 11