
I posted this question earlier about dynamically compiling code in C#, and the answer has led to another question.

One suggestion was that I use delegates, which I tried, and they work well. However, they benchmark at about 8.4x slower than direct calls, which makes no sense.

What is wrong with this code?

My results in milliseconds (.NET 4.0, 64-bit, running the exe directly), in the order native, dynamic delegate, delegate: 62, 514, 530

public static int Execute(int i) { return i * 2; }

private void button30_Click(object sender, EventArgs e)
{
    CSharpCodeProvider foo = new CSharpCodeProvider();

    var res = foo.CompileAssemblyFromSource(
        new System.CodeDom.Compiler.CompilerParameters()
        {
            GenerateInMemory = true,
            CompilerOptions = @"/optimize",                    
        },
        @"public class FooClass { public static int Execute(int i) { return i * 2; }}"
    );

    var type = res.CompiledAssembly.GetType("FooClass");
    var obj = Activator.CreateInstance(type);
    var method = type.GetMethod("Execute");
    int i = 0, t1 = Environment.TickCount, t2;
    //var input = new object[] { 2 };

    //for (int j = 0; j < 10000000; j++)
    //{
    //    input[0] = j;
    //    var output = method.Invoke(obj, input);
    //    i = (int)output;
    //}

    //t2 = Environment.TickCount;

    //MessageBox.Show((t2 - t1).ToString() + Environment.NewLine + i.ToString());

    t1 = Environment.TickCount;

    for (int j = 0; j < 100000000; j++)
    {
        i = Execute(j);
    }

    t2 = Environment.TickCount;

    MessageBox.Show("Native: " + (t2 - t1).ToString() + Environment.NewLine + i.ToString());

    var func = (Func<int, int>) Delegate.CreateDelegate(typeof (Func<int, int>), method);

    t1 = Environment.TickCount;

    for (int j = 0; j < 100000000; j++)
    {
        i = func(j);
    }

    t2 = Environment.TickCount;

    MessageBox.Show("Dynamic delegate: " + (t2 - t1).ToString() + Environment.NewLine + i.ToString());

    Func<int, int> funcL = Execute;

    t1 = Environment.TickCount;

    for (int j = 0; j < 100000000; j++)
    {
        i = funcL(j);
    }

    t2 = Environment.TickCount;

    MessageBox.Show("Delegate: " + (t2 - t1).ToString() + Environment.NewLine + i.ToString());
}
IamIC
  • A wild guess would be that in the Native case, the compiler can inline the function, while in the Delegate cases it cannot, and has to perform the method calls. Can you check the generated low-level code? – ron May 16 '12 at 11:47
  • The problem is not that delegates are slow, it is that regular method calls are so very fast. When used in code like this, they should take *zero* cycles as the jitter optimizer entirely eliminates the call and inlines the code. You are mostly measuring the cost of the for() loop. – Hans Passant May 16 '12 at 11:53
  • The disassembly shows a call. However, even though it's in Release mode, I don't know if that's 100% reliable because I was in VS for that test. – IamIC May 16 '12 at 11:54
  • I have different results: Native: 1513 Dynamic delegate: 655 Delegate: 1607 – gabba May 16 '12 at 11:55
  • @HansPassant the disassembly isn't showing what you're saying. It shows a call. Also, I don't think that calls are expensive. We're talking a push, call, pop here. Quick stuff... not 8.4 X overhead IMO. – IamIC May 16 '12 at 11:56
  • @IanC It may be inlined by the JIT compiler. You should check the final assembly code inside the debugger (attaching after the test has been run, or the JIT won't do every optimization). – Adriano Repetti May 16 '12 at 11:57
  • That's a common side-effect of looking at disassembled code. That turns off the jitter optimizer. – Hans Passant May 16 '12 at 11:58
  • @gabba did you execute in or outside of VS? I'm not sure how native could come out slower. – IamIC May 16 '12 at 12:02
  • @IanC: A function call makes a huge difference when the "push pop call return" takes longer than the meaningful operations you need to perform in the loop. I do a fair amount of image processing, you don't use (non-inlined) function calls in tight loops if you can get away with it. – Ed S. May 16 '12 at 17:37
  • @EdS. you are correct. Please see my comment on answer #2. – IamIC May 16 '12 at 21:34

2 Answers

5

As Hans mentions in the comments on your question, the Execute method is so simple that it's almost certainly being inlined by the jitter in your "native" test.

So what you're seeing isn't a comparison between a standard method call and a delegate invocation, but a comparison between an inlined i * 2 operation and a delegate invocation. (And that i * 2 operation probably boils down to just a single machine instruction, about as fast as you can get.)

Make your Execute methods a bit more complicated to prevent inlining (and/or do it with the MethodImplOptions.NoInlining compiler hint); then you'll get a more realistic comparison between standard method calls and delegate invocations. Chances are that the difference will be negligible in most situations:

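// Needs: using System; using System.CodeDom.Compiler; using System.Diagnostics;
//        using System.Runtime.CompilerServices; using Microsoft.CSharp;
[MethodImpl(MethodImplOptions.NoInlining)]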
static int Execute(int i) { return ((i / 63.53) == 34.23) ? -1 : (i * 2); }
public static volatile int Result;

private static void Main(string[] args)
{
    const int iterations = 100000000;

    {
        Result = Execute(42);  // pre-jit
        var s = Stopwatch.StartNew();

        for (int i = 0; i < iterations; i++)
        {
            Result = Execute(i);
        }
        s.Stop();
        Console.WriteLine("Native: " + s.ElapsedMilliseconds);
    }

    {
        Func<int, int> func;
        using (var cscp = new CSharpCodeProvider())
        {
            var cp = new CompilerParameters { GenerateInMemory = true, CompilerOptions = @"/optimize" };
            string src = @"public static class Foo { public static int Execute(int i) { return ((i / 63.53) == 34.23) ? -1 : (i * 2); } }";

            var cr = cscp.CompileAssemblyFromSource(cp, src);
            var mi = cr.CompiledAssembly.GetType("Foo").GetMethod("Execute");
            func = (Func<int, int>)Delegate.CreateDelegate(typeof(Func<int, int>), mi);
        }

        Result = func(42);  // pre-jit
        var s = Stopwatch.StartNew();

        for (int i = 0; i < iterations; i++)
        {
            Result = func(i);
        }
        s.Stop();
        Console.WriteLine("Dynamic delegate: " + s.ElapsedMilliseconds);
    }

    {
        Func<int, int> func = Execute;
        Result = func(42);  // pre-jit

        var s = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Result = func(i);
        }
        s.Stop();
        Console.WriteLine("Delegate: " + s.ElapsedMilliseconds);
    }
}
LukeH
  • I added the no-inline hint and re-ran the test, and you are spot on. Delegates are only about 50% slower. What was odd, though, is that the call to the dynamically compiled method was consistently faster than the call to the native method by a factor of 91% of the time. I wonder why that is. – IamIC May 16 '12 at 21:33
4

It makes sense. Delegates are not function pointers. They imply type checking, security, and a lot of other things. Their speed is closer to that of a virtual function call (see this post), even though the performance impact comes from something completely different.
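To make the "not function pointers" point concrete, here is a minimal sketch showing that a delegate is an object carrying a method, an optional target instance, and an invocation list. The class and method names (DelegateShapeSketch, Twice, Wrap) are made up for illustration and are not from the question.

```csharp
using System;

public static class DelegateShapeSketch
{
    // Hypothetical stand-in for the question's Execute method.
    public static int Twice(int i) { return i * 2; }

    // Method group conversion: allocates a delegate object wrapping Twice.
    public static Func<int, int> Wrap() { return Twice; }

    public static void Main()
    {
        Func<int, int> f = Wrap();

        Console.WriteLine(f.Method.Name);                // the wrapped MethodInfo
        Console.WriteLine(f.Target == null);             // no target instance for a static method
        Console.WriteLine(f.GetInvocationList().Length); // delegates are multicast-capable
        Console.WriteLine(f(21));                        // f(21) is shorthand for f.Invoke(21)
    }
}
```

Every call through f goes via that object rather than jumping straight to the method address, which is one reason the cost looks more like an indirect call than a direct one.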

For a good comparison of different invocation techniques (some of them not mentioned in the question) read this article.
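As a rough illustration of two of those techniques, both of which appear in the question's code, the sketch below contrasts a late-bound MethodInfo.Invoke call with a typed delegate bound once via Delegate.CreateDelegate. The class and helper names (InvocationSketch, ViaReflection, Bind) are assumptions for the example, and it measures nothing; it only shows the shape of each call.

```csharp
using System;
using System.Reflection;

public static class InvocationSketch
{
    public static int Execute(int i) { return i * 2; }

    // Late-bound call: the argument and the result are boxed into objects,
    // and the runtime validates the argument list on each call.
    public static int ViaReflection(MethodInfo method, int value)
    {
        return (int)method.Invoke(null, new object[] { value });
    }

    // Bind once to a strongly typed delegate; later calls are plain delegate invocations.
    public static Func<int, int> Bind(MethodInfo method)
    {
        return (Func<int, int>)Delegate.CreateDelegate(typeof(Func<int, int>), method);
    }

    public static void Main()
    {
        MethodInfo method = typeof(InvocationSketch).GetMethod("Execute");
        Func<int, int> typed = Bind(method);

        Console.WriteLine(ViaReflection(method, 21));
        Console.WriteLine(typed(21));
    }
}
```

If you time these yourself, the same caveats from the comments apply: run a Release build outside the debugger and warm each path up before starting the clock.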

Adriano Repetti
  • Agreed. Seems normal to me as well. I got a factor of 4 slower speed for delegates vs native calls when running in Release mode. – Darin Dimitrov May 16 '12 at 11:49
  • Wow, I had read that they're only supposed to be about half the speed. I can point to experts who think this. But... one can't argue the results. – IamIC May 16 '12 at 11:49
  • @DarinDimitrov you ran my code, but got 4 X slower? I wonder why I'm getting 8, then. – IamIC May 16 '12 at 11:50
  • @IanC I guess that a true comparison is not possible. With JIT compiling the results will be different on each machine (you can only say they're considerably slower). Moreover, the test environment matters a lot too (I read a good article about performance tests in Java but I can't remember where). – Adriano Repetti May 16 '12 at 11:54