Today I was going to implement a method to traverse an arbitrarily deep graph and flatten it into a single enumerable. Instead, I did a little searching first and found this:
public static IEnumerable<T> Traverse<T>(this IEnumerable<T> enumerable, Func<T, IEnumerable<T>> recursivePropertySelector)
{
foreach (T item in enumerable)
{
yield return item;
IEnumerable<T> seqRecurse = recursivePropertySelector(item);
if (seqRecurse == null) continue;
foreach (T itemRecurse in Traverse(seqRecurse, recursivePropertySelector))
{
yield return itemRecurse;
}
}
}
In theory this looks good, but in practice I've found it performs significantly more poorly than using equivalent hand-written code (as the situation arises) to go through a graph and do whatever needs to be done. I suspect this is because in this method, for every item it returns, the stack has to unwind to some arbitrarily deep level.
I also suspect that this method would run a lot more efficiently if the recursion were eliminated. I also happen to not be very good at eliminating recursion.
Does anyone know how to rewrite this method to eliminate the recursion?
Thanks for any help.
EDIT: Thanks very much for all the detailed responses. I've tried benchmarking the original solution vs Eric's solution vs not using an enumerator method and instead recursively traversing with a a lambda and oddly, the lambda recursion is significantly faster than either of the other two methods.
class Node
{
public List<Node> ChildNodes { get; set; }
public Node()
{
ChildNodes = new List<Node>();
}
}
class Foo
{
public static void Main(String[] args)
{
var nodes = new List<Node>();
for(int i = 0; i < 100; i++)
{
var nodeA = new Node();
nodes.Add(nodeA);
for (int j = 0; j < 100; j++)
{
var nodeB = new Node();
nodeA.ChildNodes.Add(nodeB);
for (int k = 0; k < 100; k++)
{
var nodeC = new Node();
nodeB.ChildNodes.Add(nodeC);
for(int l = 0; l < 12; l++)
{
var nodeD = new Node();
nodeC.ChildNodes.Add(nodeD);
}
}
}
}
nodes.TraverseOld(node => node.ChildNodes).ToList();
nodes.TraverseNew(node => node.ChildNodes).ToList();
var watch = Stopwatch.StartNew();
nodes.TraverseOld(node => node.ChildNodes).ToList();
watch.Stop();
var recursiveTraversalTime = watch.ElapsedMilliseconds;
watch.Restart();
nodes.TraverseNew(node => node.ChildNodes).ToList();
watch.Stop();
var noRecursionTraversalTime = watch.ElapsedMilliseconds;
Action<Node> visitNode = null;
visitNode = node =>
{
foreach (var child in node.ChildNodes)
visitNode(child);
};
watch.Restart();
foreach(var node in nodes)
visitNode(node);
watch.Stop();
var lambdaRecursionTime = watch.ElapsedMilliseconds;
}
}
Where TraverseOld is the original method, TraverseNew is Eric's method, and obviously the lambda is the lambda.
On my machine, TraverseOld takes 10127 ms, TraverseNew takes 3038 ms, the lambda recursion takes 1181 ms.
Is this typical that enumerator methods (with yield return) can take 3X as long as opposed to immediate execution? Or is something else going on here?