list of all paths from source to sink in directed acyclic graph

Question

Possible Duplicate:
[python]: path between two nodes

Can anyone point me to some resources on how to do this? I'm using networkx as my python library.

Thanks!

Are you guaranteed that there is only one source that all node changes eventually lead back to, and there is only one sink that all node chains eventually lead to? Or are the source and sink part of a larger graph that extends to things that lead from the source, or lead into the sink? — Omnifarious, Jul 19 '10 at 05:15
Hmmm. I don't exactly know what you mean by node change, but there are finite X distinct sources and very finite Y distinct sinks — Tyler, Jul 19 '10 at 05:21
@Tyler - And the goal is given a distinct source, find all paths to a distinct sink? — Omnifarious, Jul 19 '10 at 07:16
@Tyler - I gave an algorithm that should work, though I haven't tested it. It's more clearly correct than Alex's I think, and there's a fairly simple optimization that could be made. — Omnifarious, Jul 19 '10 at 07:36

score 6 · Accepted Answer · edited Aug 30 '11 at 12:16

This is based on Alex Martelli's answer, but it should work. It depends on the expression source_node.children yielding an iterable that will iterate over all the children of source_node. It also relies on there being a working way for the == operator to compare two nodes to see if they are the same. Using is may be a better choice. Apparently, in the library you're using, the syntax for getting an iterable over all the children is graph[source_node], so you will need to adjust the code accordingly.

def allpaths(source_node, sink_node):
    if source_node == sink_node: # Handle trivial case
        return frozenset([(source_node,)])
    else:
        result = set()
        for new_source in source_node.children:
            paths = allpaths(new_source, sink_node, memo_dict)
            for path in paths:
                path = (source_node,) + path
                result.add(path)
        result = frozenset(result)
        return result

My main concern is that this does a depth first search, it will waste effort when there are several paths from the source to a node that's a grandchild, great grandchild, etc all of source, but not necessarily a parent of sink. If it memoized the answer for a given source and sink node it would be possible to avoid the extra effort.

Here is an example of how that would work:

def allpaths(source_node, sink_node, memo_dict = None):
    if memo_dict is None:
        # putting {}, or any other mutable object
        # as the default argument is wrong 
        memo_dict = dict()

    if source_node == sink_node: # Don't memoize trivial case
        return frozenset([(source_node,)])
    else:
        pair = (source_node, sink_node)
        if pair in memo_dict: # Is answer memoized already?
            return memo_dict[pair]
        else:
            result = set()
            for new_source in source_node.children:
                paths = allpaths(new_source, sink_node, memo_dict)
                for path in paths:
                    path = (source_node,) + path
                    result.add(path)
            result = frozenset(result)
            # Memoize answer
            memo_dict[(source_node, sink_node)] = result
            return result

This also allows you to save the memoization dictionary between invocations so if you need to compute the answer for multiple source and sink nodes you can avoid a lot of extra effort.

@Tyler, yeah, I'm trying to remember exactly how it went. :-) — Omnifarious, Jul 19 '10 at 05:11
@Tyler - I realized that the algorithm I wrote only bears a passing resemblance to what you want. I'm going to think about this really hard for a bit, but in the meantime I'm deleting my answer. — Omnifarious, Jul 19 '10 at 05:22
I'm getting `TypeError: can only concatenate tuple (not "instance") to tuple` with `path = (source_node,) + path` — Tyler, Jul 19 '10 at 12:00
@Omnifarious you used mutable default argument in the `allpaths` example, which would mess things if you call the function more than once (recursion not counted) in the same program. Fixed that. — Boris Gorelik, Aug 30 '11 at 12:17

bukzor · Answer 2 · 2011-04-16T00:34:44.397

This one actually works with networkx, and it's non-recursive, which may be nice for large graphs.

def find_all_paths(graph, start, end):
    path  = []
    paths = []
    queue = [(start, end, path)]
    while queue:
        start, end, path = queue.pop()
        print 'PATH', path

        path = path + [start]
        if start == end:
            paths.append(path)
        for node in set(graph[start]).difference(path):
            queue.append((node, end, path))
    return paths

Alex Martelli · Answer 3 · 2010-07-19T14:20:13.553

I'm not sure if there are special optimizations available -- before looking for any of them, I'd do a simple recursive solution, something like (using of networkx only the feature that indexing a graph by a node gives an iterable yielding its neighbor nodes [[a dict, in networkx's case, but I'm not making use of that in particular]])...:

def allpaths(G, source_nodes, set_of_sink_nodes, path_prefix=()):
  set_of_result_paths = set()
  for n in source_nodes:
    next_from_n = []
    for an in G[n]:
      if an in set_of_sink_nodes:
        set_of_result_paths.add(path_prefix + (n, an))
      else:
        next_from_n.append(an)
    if next_from_n:
      set_of_result_paths.update(
          allpaths(G, next_from_n, set_of_sink_nodes, path_prefix + (n,)))
  return set_of_result_paths

This should be provably correct (but I'm not going to do the proof because it's very late and I'm tired and fuzzy-headed;-) and usable to verify any further optimizations;-).

First optimization I'd try would be some kind of simple memoizing: if I've already computed the set of paths from some node N to any goal node (whatever the prefix leading to N was when I did that computation), I can stash that away in a dict under key N and avoid further recomputations if and when I get to N again by a different route;-).

I would up-vote your post for its coolness, but I'd feel an obligation to verify its correctness first. — Eric Walker, Jul 19 '10 at 06:30

list of all paths from source to sink in directed acyclic graph

3 Answers3

Linked