I am working with a (number of) directed graphs with no cycles in them, and I have the need to find all simple paths between any two nodes. In general I wouldn't worry about the execution time, but I have to do this for very many nodes during very many timesteps - I am dealing with a time-based simulation.
I had tried in the past the facilities offered by NetworkX but in general I found them slower than my approach. Not sure if anything has changed lately.
I have implemented this recursive function:
import timeit
def all_simple_paths(adjlist, start, end, path):
path = path + [start]
if start == end:
return [path]
paths = []
for child in adjlist[start]:
if child not in path:
child_paths = all_simple_paths(adjlist, child, end, path)
paths.extend(child_paths)
return paths
fid = open('digraph.txt', 'rt')
adjlist = eval(fid.read().strip())
number = 1000
stmnt = 'all_simple_paths(adjlist, 166, 180, [])'
setup = 'from __main__ import all_simple_paths, adjlist'
elapsed = timeit.timeit(stmnt, setup=setup, number=number)/number
print 'Elapsed: %0.2f ms'%(1000*elapsed)
On my computer, I get an average of 1.5 ms per iteration. I know this is a small number, but I have to do this operation very many times.
In case you're interested, I have uploaded a small file containing the adjacency list here:
I am using adjacency lists as inputs, coming from a NetworkX DiGraph representation.
Any suggestion for improvements of the algorithm (i.e., does it have to be recursive?) or other approaches I may try are more than welcome.
Thank you.
Andrea.