4

I am writing a traversal to find the longest path within a road. The magic part of this code is the segment.Next refers to LINQ that has specific logic applied to it like not revisiting already visited nodes. Therefore, don't point out the flaws in the travsel as thats out of scope.

What I am trying to do is reduce the number of calls on stack because sometimes the path can be a 5000 long. I know I have to make this recursive call tail recursive.

public static IEnumerable<Segment> FindLongestPath(Segment segment)
{
    var rv = new List<Segment> {segment};
    var longestPathLength = 0;
    var longestNextPaths = Enumerable.Empty<Segment>();

    foreach (var n in segment.Next)
    {
        var paths = FindLongestPath(n);
        var length = paths.Sum(p => p.LengthMeters);
        if (length > longestPathLength)
        {
            longestPathLength = length;
            longestNextPaths = paths;
        }
    }

    rv.AddRange(longestNextPaths);

    return rv;
}

How can I make this recursive call be tail recursive? I know I probably have to maintain the IEnumerable<Segment> as I travel but I am just not wrapping my head around it.

CuriousDeveloper
  • 849
  • 2
  • 8
  • 27
  • I think what you need is after the var paths = FindLongestPath(n) call, add n to a "already checked all child routes" ienumberable, which you'd check for existence of n in right before the FindLongestPath and continue the loop if it is. Of course that would have to get passed into the findlongestpath as well – Eric Lizotte Jun 09 '17 at 14:09
  • I note that not only is your method likely to blow the stack, but if it does not, the number of redundant lists it produces is *enormous*. You would do better here to make your paths *immutable stacks* that you can then push new elements onto without having to reallocate and copy the body. – Eric Lippert Jun 09 '17 at 15:04

2 Answers2

4

The answer from spender is the practical way to solve this problem without recursion: use an explicit stack or queue as a helper.

The original question and spender, in a comment, wonder how to do this algorithm in tail recursion style and continuation passing style, respectively. (CPS is a style of programming in which every call is a tail call.)

To give you the flavour of how a CPS version of this algorithm would look, let me (1) simplify the problem considerably, and (2) write the solution in ML, not C#. The simplified problem is:

  • The function children takes a node and produces a stack of child nodes.
  • The function cost gives the cost of traversing a single node.
  • The problem given is to find the cost of the maximum cost path.

First, a straightforward non-CPS solution in ML:

let rec maximum_path_cost node =
  let rec aux nodes max =
    match nodes with
    | [] -> max
    | head :: tail -> 
       let c = maximum_path_cost head in
       let new_max = if c > max then c else max in
       aux tail new_max
  in
  (cost node) + (aux (children node) 0)

Briefly: we simulate a loop with a recursive auxiliary function that accumulates the maximum seen so far. The loop condition is "is the list empty?" If yes, then the result is the maximum seen so far; if not, then we compute the cost of the current item (the head of the list), compare it to the max, and run the loop on the tail.

Note that aux is tail recursive, but maximum_path_cost is not.

In continuation passing style, maximum_path_cost takes a continuation -- in this case, a function that takes an int -- and is required to call that function with its result, rather than returning. We'll make aux do the same.

For simplicity, we will not transform cost and children into CPS.

let rec maximum_path_cost node continuation =
  let rec aux nodes max aux_continuation =
    match nodes with
    | [] -> aux_continuation max
    | head :: tail ->
       let mpcc c = 
         let new_max = if c > max then c else max in
         aux tail new_max aux_continuation
       in
       maximum_path_cost head mpcc
  in
  let ac result =
    continuation ((cost node) + result) 
  in
    aux (children node) 0 ac

I know it is hard to wrap your brain around it, but if you read through it, it should make sense. The first thing we do is call aux with the children and a current max of zero; what is the continuation of the first call to aux? To add its result to the cost of the head, and pass that along to the continuation of maximum_path_cost. When do we do that? When we've run down the entire list of child nodes and have none left.

Translating this into C# and making C# guarantee tail recursion is left as an exercise. :)

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
3

Doing this with tail recursion is going to be tricky because you'll need to hand around continuations as delegates in order to do the post-recursion processing. The code will look pretty nasty to anyone who isn't versed in the functional style.

It seems like your prime motivation here is not to blow the call-stack. You can reduce your "brain-burn" by taking a non-recursive approach. Using an explicit Queue<T>/Stack<T> (depending on whether you want to traverse depth or breadth first), rather than the implicit stack you get from non-tail-recursive method calls means that your stack is limited only by available memory.

This should get you started down that road:

public static IEnumerable<Segment> FindLongestPath(Segment segment)
{
    var queue = new Queue<Segment>(); //or a Stack<Segment> with Push and Pop
    queue.Enqueue(segment);

    while(queue.Any())
    {
        var currentSegment = queue.Dequeue();
        foreach(var seg in currentSegment.Next)
        {
            queue.Enqueue(seg);
        }
        //process currentSegment
    }
}
spender
  • 117,338
  • 33
  • 229
  • 351
  • Ah, didn't even occur to me to replace the call-stack with a stack datatype. I will consider this a solution in a few hours if no one else has any additional suggestions or input – CuriousDeveloper Jun 09 '17 at 15:04
  • 1
    @CuriousDeveloper: This is definitely the right way to go; for more information on this you can look up how explicit stacks and queues are used for depth first and breadth first traversals. – Eric Lippert Jun 09 '17 at 15:06
  • @CuriousDeveloper I'd be interested in a [CPS recursion](https://www.google.co.uk/search?q=cps+recursion) solution too, but as an aggrieved parent, today I'm having trouble wrapping my head around it too. – spender Jun 09 '17 at 15:07
  • @EricLippert This [question of mine](https://stackoverflow.com/questions/35464468/async-recursion-where-is-my-memory-actually-going) that you answered got me thinking... I was considering forming a question, but seeing as you're here... I know that the cost of state-machines is high, but I had a thought that `await` or `yield` could be used to make the continuations less continuation-like. Would this work, and would it be tail recursion if it did? – spender Jun 09 '17 at 15:14
  • 1
    @spender: In theory, yes; you would get "stackless recursion" by clever use of await. In practice, there are some difficulties. `await` checks the returned task to see if it is completed; if it is, then it just keeps on going, and then you just have your normal recursive algorithm with a bunch of extra cost. If it is not completed, then what causes it to complete in the future? – Eric Lippert Jun 09 '17 at 15:32
  • I think this article helps https://stackoverflow.com/questions/687731/breadth-first-vs-depth-first#687752 – Kixoka Jun 09 '17 at 15:37
  • I've added an answer that gives a CPS solution, in ML for brevity and convenience; see if you can wrap your brain around it. – Eric Lippert Jun 09 '17 at 15:59