Are simple tail recursive functions as efficient as loops?

Question

Sometimes I find myself writing tail recursive functions. I have been searching high and low, and I have found there is tail recursion in the .NET framework, but I'm not sure in what cases I can, and in what cases I can't use tail recursion efficiently. For example, I have a simple tree, lets call it

public class Tree
{
  public Tree parent = null;

  public Tree(Tree parent)
  {
    this.parent = parent;
    this.children = new List<Tree>();
  }

  public List<Tree> children {get; private set;}

  public Tree root
  {
    get
    {
      return this.parent == null ? this : parent.root;
    }
  }
}

for the root property, will the compiler emit a loop? will it emit .tail? Will the jitter respect the .tail? will nothing fancy happen, and will the algorithm just run recursively? Most importantly, should I rewrite this to be iterative?

To receive good votes on question, please attach some results of your research effort. Can you write and time those cases you mentioned? — Peter Ivan, Dec 19 '12 at 11:32
I was hoping someone know the answer. I'm unfamilliar with the disassembler myself, and since this specific case hinges on debug or release versioning and I have no idea how this affects or might affect the JIT or the bytecode, and that if this is a JIT effect, it will be very hard to measure effectively I feared if I tried to measure it, I might draw the wrong conclusions. — Martijn, Dec 19 '12 at 11:32

leppie · Accepted Answer · 2012-12-19T11:45:57.963

9

The C# compiler will never emit tail prefix.

F# will do so, if the call is in tail position. It depends on the order you traverse the tree whether tail recursion is applicable.

In your code, there is nothing in a tail position. The reason is the usage of a ternary operator. If the code is rewritten to use if statements with each branch returning, then the call to parent.root will be in tail position.

In terms of optimization, the compiler (F# or IronScheme) will normally convert tail recursive calls into while (true) { ... } loops. This is done as it removes both the tail call and the need to call the method again.

So if C# was allowed to emit tail calls (which it is not) it would likely be transformed from:

public Tree root
{
  get
  {
    if (parent == null) return this;
    else return parent.root; // now in tail position
  }
}

to (just a guestimate)

public Tree root
{
  get
  {
    Tree temp = this;
    while (true)
    {
      if (temp.parent == null)
      {
         return temp;
      }
      temp = temp.parent;
    }
  }
}

F# and IronScheme both does the same transformation. This is called tail call elimination (TCE). And yes, it will be marginally faster than the C# version. (I have tested this by microbenchmarking fib on C#, F# and IronScheme)

edited Dec 19 '12 at 11:45

answered Dec 19 '12 at 11:31

leppie

115,091
17
196
297

Can you add some documentation reference? – Peter Ivan Dec 19 '12 at 11:36
It's been added to C# http://blogs.msdn.com/b/clrcodegeneration/archive/2009/05/11/tail-call-improvements-in-net-framework-4.aspx – Dave Bish Dec 19 '12 at 11:37
@DaveBish: clrcodegen blog refers to the CLR, and not the C# compiler. – leppie Dec 19 '12 at 11:40
@PeterIvan: documentation reference for what exactly? The ECMA spec for the CLR describes it. – leppie Dec 19 '12 at 11:42
@leppie Looks like I belatedly realized that the tail calls mentioned in the first blog I linked to described using ildasm to explicitly rewrite those calls. Guess you're right about it. – Jon Limjap Dec 19 '12 at 11:44
I suppose that it is theoretically possible to have a post-compilation process that uses the tail call IL instructions wherever it makes sense. Some friends did this for Java recently. – Drew Noakes Dec 19 '12 at 11:46
@DrewNoakes: The cost of a tail call is about 3-7 times slower than a normal call. It is not really worth to make all calls in tail positions tail calls. You need to apply TCE too to even out the benefits. – leppie Dec 19 '12 at 11:47
@leppie: I meant whether you say it from your experience/tests or because s.o. (specification) says so. Thanks. – Peter Ivan Dec 19 '12 at 12:02
@leppie, I thought that ternary operators and other similar works exactly as "sequence points". I've looked through IL of both implementation and there is no difference. – ony Dec 19 '12 at 12:05
@PeterIvan: I wrote IronScheme over the last 6 years and been researching the F# compiler's codegen a bit too (the underlying semantics of both are very close). – leppie Dec 19 '12 at 12:07
@ony: There may be subtle differences (perhaps not in this case). Semantically, the ternary operator should leave a value on the stack and have a single point of exiting the function. – leppie Dec 19 '12 at 12:10
Tail recursion in c# http://stackoverflow.com/questions/7281552/tail-recursion-optimization-happens-in-visual-studio-10-x64-debug-but-not-in-rel – Roger Johansson Dec 19 '12 at 12:21
@RogerAlsing: The JIT may apply such optimizations when the stack is under pressure. I dont know of those details. – leppie Dec 19 '12 at 12:26
1

So may I conclude: "There is no tail call elimination, loop by hand to avoid running out of stack space (and if there were, the ternary operator in my example ruins it anyway)"? – Martijn Dec 19 '12 at 12:45
6

That is close enough to correct for practical purposes. Yes, the C# compiler never emits the tail prefix, however, it is legal for the jitter to realize that a method can be made tail recursive even without the prefix. The x86 32 bit jitter never does so but the 64 bit jitter occasionally will make a C# method tail recursive for you, which is somewhat surprising. – Eric Lippert Dec 19 '12 at 15:21
If I can't count on it, I'd better not use it at all. It's just, it looks and feels so sleek. Can't you give MS a call, and ask them if you can briefly come back to make the compiler do TCE? – Martijn Dec 19 '12 at 18:07
2

as I would much fancy this, I opened a uservoice on this one: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/3475034-add-tail-call-elimination-to-the-c-compiler – Martijn Dec 19 '12 at 23:10
@Martijn, afaik even in functional languages it is much better to use high-order function when it is possible to delegate recursion to them. – ony Dec 20 '12 at 09:15
If it's just as performant, or almost as performant, why wouldn't you use a oneliner as `return this.parent == null ? this : parent.root;`? Why would it be better to use a higher order function? What would this hypothetical higher order function even look like? – Martijn Dec 20 '12 at 10:18

score 1 · Answer 2 · edited May 23 '17 at 10:24

1

There is similar answer other answer.

Regarding speed. Tail recursion optimization isn't really different from loop for small functions. When tail call optimization fired it just replace "call" instruction (on x86) with "jmp". When doing the same through loop you'll have the same "jmp" instruction for entering next cycle. One moment you should remember is that the whole body of function will be the body of cycle and thus you should try to minimize size of recursive functions.

edited May 23 '17 at 10:24

Community

1
1

answered Dec 19 '12 at 11:59

ony

12,457
1
33
41

Note: It is not called `tail call optimization`, but rather `tail call elimination`. :) – leppie Dec 19 '12 at 12:31
@leppie, I don't think that the only that term is applicable (see [wiki](http://en.wikipedia.org/wiki/Tail_call)). I know that moment from [Haskell](http://en.wikibooks.org/wiki/Haskell/Recursion) and `mono --optimize=tailc`. Using that rule you can call `loop rewind optimization` as a `in loop explicit cmp elimination` ;) – ony Dec 20 '12 at 08:44
The wiki says both are the same... TCO just sounds functionally and semantically wrong. – leppie Dec 20 '12 at 08:52

Are simple tail recursive functions as efficient as loops?

2 Answers2