3

I've discovered (by accident) that the last CLR does the tail call optimization. I have tested it with a piece of code, but frankly it doesn't behave the way I expected. I thought the tail call optimization may happen when the last thing in the function is a function call.

I'm trying to "break" this code to prevent form tail call op.

class Program
{
    static void Foo(int counter, int limit)
    {
        try
        {
            if (counter == limit)
            {
                return;
            }
            Foo(++counter, limit);

            int d = 1;
            d = Bar(d);
            //Console.Write(d);
            //Thread.Sleep(1);
            int baz = 0;
            var z = baz + d;
            StringBuilder b = new StringBuilder();
            b.Append("D");
        }
        catch (Exception)
        {
            throw;
        }
    }

    static int Sum(int s)
    {
        if (s == 1)
        {
            return s;
        }
        return s + Sum(s - 1);
    }

    static int Bar(int d)
    {
      return  d = 10 + d;
    }

    static void Main(string[] args)
    {
        int i = 0;

        Foo(i, 10000); // jitter 
        Sum(10000);

        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        Foo(i, 10000);
        stopwatch.Stop();
        Console.WriteLine(string.Format("time of execution = {0}ms",stopwatch.ElapsedMilliseconds));

        stopwatch = new Stopwatch();
        stopwatch.Start();
        Sum(10000);
        stopwatch.Stop();
        Console.WriteLine(string.Format("time of execution = {0}ms", stopwatch.ElapsedMilliseconds));

        Console.ReadKey();
    }
}

Yet still Foo is optimized. How come?

Community
  • 1
  • 1
Lukasz Madon
  • 14,664
  • 14
  • 64
  • 108
  • Have you looked at the IL generated, or are you just basing this on the timing? – Mike Caron Jun 24 '11 at 00:31
  • @Mike: Probably based on the non-explosive stack usage, although 10000 deep might not be enough to overflow the default stack when the function is so simple. – Ben Voigt Jun 24 '11 at 00:33
  • @Ben, I dunno. `baz` isn't used for anything meaningful, so I can see the compiler omitting it entirely, and I suspect the JITter would use the same storage for `z` as `d`, so it's only 8 bytes per frame, which isn't close to being enough to blow a default 1 Meg stack. Then again, the best way to tell would be to look at the IL ;) – Mike Caron Jun 24 '11 at 00:36
  • @Mike: Isn't that what I said? (But I think you forgot to account for the return address and maybe saving some registers.) – Ben Voigt Jun 24 '11 at 00:38
  • @Ben, yup, that is what you said. Either way, registers and return addresses are for the JITter to worry about. Either way, I'm looking at the IL from the OP's code, and compiling with `/o`, I don't see any bonafide tail calls (that is, `tail` followed by a `call`, `calli` or `callvirt`). Maybe I'm reading ILDASM wrong... – Mike Caron Jun 24 '11 at 00:50
  • arg damn Reflector is no longer free:/ can't look at the IL – Lukasz Madon Jun 24 '11 at 00:54
  • @lukas, use `ildasm`, which is part of .NET. Anyway, according to this (http://stackoverflow.com/questions/491376/why-doesnt-net-c-eliminate-tail-recursion), C# doesn't do tail calls (edit: on x86) anyway, so this is all moot. – Mike Caron Jun 24 '11 at 01:03
  • 1
    Neither is optimized in mainstream jitters. Just add a 0 to make it bomb on SO. – Hans Passant Jun 24 '11 at 01:59
  • 1
    .NET's current handling of tail functions is pretty lousy. That's why the F# compiler optimizes it into a `while(true)` loop. – vcsjones Jun 24 '11 at 02:13

1 Answers1

0

You haven't done anything with side-effects after the recursive call to Foo. I assume you tried the commented out code and it DID prevent optimization. So what's the problem?

You could also write to a class member, that would be a side-effect that couldn't be discarded.

private static int dummy;
static void Foo(int counter, int limit)
{
    if (counter == limit) return;
    Foo(++counter, limit);
    dummy++;
}

and then read dummy at the end of Main.

Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
  • Actually, it did not. 10 000 run of Thread.Sleep(1) takes 10 032ms. I though if I prevent optimization the stack would blow up. – Lukasz Madon Jun 24 '11 at 00:35
  • 1
    @lucas: This function doesn't use very much stack. Even 10000 nested calls isn't going to exhaust the default stack size on your thread. – Ben Voigt Jun 24 '11 at 00:36
  • I added dummy += z; to the Foo and still time is 0ms – Lukasz Madon Jun 24 '11 at 00:42
  • The JIT should not be smart enough to delete the stringbuilder stuff. It only can conclude that it's side-effect free if all of that is inlined. Also, the JIT does not delete allocations ATM. – usr Sep 17 '16 at 18:02
  • Sry, commented on an aged post. I really hate that obsolete questions are bumped. It was never useful to me. – usr Sep 17 '16 at 18:03