1

There are 3 pieces of code which do the same thing, however their performance differs in an x64 Release build.

I guess it is because of Branch Prediction. Could anyone elaborate further?

Conditional: takes 41 ms

for (int j = 0; j < 10000; j++)
{
    ret = (j * 11 / 3 % 5) + (ret % 11 == 4 ? 2 : 1);
}

Normal: takes 51 ms

for (int j = 0; j < 10000; j++)
{
    if (ret % 11 == 4)
    {
        ret = 2 + (j * 11 / 3 % 5);
    }
    else
    {
        ret = 1 + (j * 11 / 3 % 5);
    }
}

Cached: takes 44 ms

for (int j = 0; j < 10000; j++)
{
    var tmp = j * 11 / 3 % 5;
    if (ret % 11 == 4)
    {
        ret = 2 + tmp;
    }
    else
    {
        ret = 1 + tmp;
    }
}
colinfang
  • 1
    How are you measuring? I don’t believe that the differences are significant. In particular, the last two codes should be equal unless the JIT screws up big time. – Konrad Rudolph Sep 17 '12 at 10:36
  • 2
    could you show the measuring code? – Jodrell Sep 17 '12 at 10:45
  • 1
    Have a look at this http://stackoverflow.com/questions/2259741/is-the-conditional-operator-slow – V4Vendetta Sep 17 '12 at 10:47
  • Just to make sure: you're testing this in Release mode without the debugger attached, right? – svick Sep 17 '12 at 11:03
  • Your samples do not support the title. The first simply offers a larger target for optimization (both compilers). – H H Sep 17 '12 at 11:10
  • @HenkHolterman I know it is a bit inappropriate but I cannot think of a better one. Feel free to edit it if you came up with a better title. – colinfang Sep 17 '12 at 11:14
  • We could do with knowing not only how you're measuring the code but how often. If you've only run the code once, the timings are hardly likely to be representative - hence my suggestion of looking deeper at the intermediate code. – Robbie Dee Sep 17 '12 at 11:43
  • The quick answer to these sorts of questions is often to look at the CIL (or MSIL in old money). – Robbie Dee Sep 17 '12 at 10:55

2 Answers

2

EDIT 3 If I go back to the original test with the timing error fixed, I get output similar to this.

Conditional took 67ms

Normal took 83ms

Cached took 73ms

This shows that the Ternary/Conditional operator can be marginally faster in a for loop. Given the previous finding (that when the logical branch is abstracted out of the loop, the if block beats the Ternary/Conditional operator), we can infer that the compiler is able to make additional optimizations when the Conditional/Ternary operator is used iteratively, in at least some cases.

It is not clear to me why these optimizations do not apply, or are not applied, to the standard if block. The actual differential is fairly minor and, I would argue, a moot point.

EDIT 2

There is in fact a glaring error in my test code, highlighted here.

The Stopwatch is not reset between calls. When I use Stopwatch.Restart instead of Stopwatch.Start and increase the iterations to 1000000000, I get these results:

Conditional took 22404ms

Normal took 21403ms

This is more like the result I was expecting, and it is borne out by the extracted CIL. So the "normal" if is in fact marginally faster than the Ternary/Conditional operator when isolated from surrounding code.
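The difference between Stopwatch.Start and Stopwatch.Restart described above can be illustrated with a small sketch. The class name StopwatchDemo, the method Measure and the Thread.Sleep calls are stand-ins invented for this illustration, not part of the original test; the exact millisecond values will vary from run to run.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class StopwatchDemo
{
    // Hypothetical helper: returns three readings in milliseconds.
    public static (long First, long Resumed, long Restarted) Measure()
    {
        var stopwatch = new Stopwatch();

        stopwatch.Start();
        Thread.Sleep(50);                 // stand-in for the first benchmark
        stopwatch.Stop();
        long first = stopwatch.ElapsedMilliseconds;

        stopwatch.Start();                // resumes: elapsed time keeps accumulating
        Thread.Sleep(50);                 // stand-in for the second benchmark
        stopwatch.Stop();
        long resumed = stopwatch.ElapsedMilliseconds;    // first + second phase

        stopwatch.Restart();              // resets elapsed time to zero, then starts
        Thread.Sleep(50);                 // stand-in for the third benchmark
        stopwatch.Stop();
        long restarted = stopwatch.ElapsedMilliseconds;  // third phase only

        return (first, resumed, restarted);
    }

    static void Main()
    {
        var (first, resumed, restarted) = Measure();
        Console.WriteLine(
            "first={0}ms resumed={1}ms restarted={2}ms",
            first, resumed, restarted);
    }
}
```

With Start, the second reading includes the first phase; with Restart, each reading covers only its own phase.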

EDIT

After my investigations outlined below, I would suggest that when using a logical condition to choose between two constants or literals, the Conditional/Ternary operator can be significantly faster than the standard if block. In my tests, it was roughly twice as fast.

However, I can't quite work out why. The CIL produced by the normal if is longer, but for both functions the average execution path seems to be six lines, including 3 loads and 1 or 2 jumps. Any ideas?


Using this code,

using System;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main()
    {
        var stopwatch = new Stopwatch();

        var conditional = Conditional(10);
        var normal = Normal(10);
        var cached = Cached(10);

        if (new[] { conditional, normal }.Any(x => x != cached))
        {
            throw new Exception();
        }

        stopwatch.Start();
        conditional = Conditional(10000000);
        stopwatch.Stop();
        Console.WriteLine(
            "Conditional took {0}ms", 
            stopwatch.ElapsedMilliseconds);

        ////stopwatch.Start(); incorrect
        stopwatch.Restart();
        normal = Normal(10000000);
        stopwatch.Stop();
        Console.WriteLine(
            "Normal took {0}ms", 
            stopwatch.ElapsedMilliseconds);

        ////stopwatch.Start(); incorrect
        stopwatch.Restart();
        cached = Cached(10000000);
        stopwatch.Stop();
        Console.WriteLine(
            "Cached took {0}ms", 
            stopwatch.ElapsedMilliseconds);

        if (new[] { conditional, normal }.Any(x => x != cached))
        {
            throw new Exception();
        }

        Console.ReadKey();
    }

    static int Conditional(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            ret = (j * 11 / 3 % 5) + (ret % 11 == 4 ? 2 : 1);
        }

        return ret;
    }

    static int Normal(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            if (ret % 11 == 4)
            {
                ret = 2 + (j * 11 / 3 % 5);
            }
            else
            {
                ret = 1 + (j * 11 / 3 % 5);
            }
        }

        return ret;
    }

    static int Cached(int iterations)
    {
        var ret = 0;
        for (int j = 0; j < iterations; j++)
        {
            var tmp = j * 11 / 3 % 5;
            if (ret % 11 == 4)
            {
                ret = 2 + tmp;
            }
            else
            {
                ret = 1 + tmp;
            }
        }

        return ret;
    }
}

Compiled in x64 Release mode with optimizations and run without a debugger attached, I get this output:

Conditional took 65ms

Normal took 148ms

Cached took 217ms

and no exception is thrown.
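If these figures were taken with Stopwatch.Start rather than Stopwatch.Restart (the error discussed in EDIT 2), they are cumulative readings rather than per-method timings. A minimal sketch of the correction, using a hypothetical helper named ToPerPhase that is not part of the original test code:

```csharp
using System;

static class CumulativeTimings
{
    // Hypothetical helper: converts cumulative stopwatch readings
    // into the duration of each individual phase.
    public static long[] ToPerPhase(long[] cumulative)
    {
        var perPhase = new long[cumulative.Length];
        long previous = 0;
        for (int i = 0; i < cumulative.Length; i++)
        {
            perPhase[i] = cumulative[i] - previous; // time spent in this phase only
            previous = cumulative[i];
        }

        return perPhase;
    }

    static void Main()
    {
        long[] readings = { 65, 148, 217 };  // Conditional, Normal, Cached as printed above
        Console.WriteLine(string.Join(", ", ToPerPhase(readings))); // prints "65, 83, 69"
    }
}
```

Read this way, the three methods are much closer together than the raw output suggests.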


Using ILDASM to disassemble the code, I can confirm that the CIL for the three methods differs, the code for the Conditional method being somewhat shorter.


To really answer the "why" question, I would need to understand the code of the compiler. I would probably need to know why the compiler was written that way.


You can break this down even further, so that you actually compare just logical functions and ignore all other activity.

static int Conditional(bool condition, int value)
{
    return value + (condition ? 2 : 1);
}

static int Normal(bool condition, int value)
{
    if (condition)
    {
        return 2 + value;
    }

    return 1 + value;
}

Which you could iterate with

static int Looper(int iterations, Func<bool, int, int> operation)
{
    var ret = 0;
    for (var j = 0; j < iterations; j++)
    {
        var condition = ret % 11 == 4;
        var value = ((j * 11) / 3) % 5;
        ret = operation(condition, value);
    }

    return ret;
}
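A minimal sketch of how the pieces above might be wired together and timed. The class name LooperDemo and the helper TimeMs are assumptions made for this illustration; the warm-up call is there so that JIT compilation is less likely to be included in the measurement.

```csharp
using System;
using System.Diagnostics;

static class LooperDemo
{
    public static int Conditional(bool condition, int value)
    {
        return value + (condition ? 2 : 1);
    }

    public static int Normal(bool condition, int value)
    {
        if (condition)
        {
            return 2 + value;
        }

        return 1 + value;
    }

    public static int Looper(int iterations, Func<bool, int, int> operation)
    {
        var ret = 0;
        for (var j = 0; j < iterations; j++)
        {
            var condition = ret % 11 == 4;
            var value = ((j * 11) / 3) % 5;
            ret = operation(condition, value);
        }

        return ret;
    }

    // Hypothetical helper: one small warm-up run, then one timed run.
    public static long TimeMs(int iterations, Func<bool, int, int> operation)
    {
        Looper(10, operation);                  // warm-up, not timed
        var stopwatch = Stopwatch.StartNew();   // fresh stopwatch per measurement
        Looper(iterations, operation);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int iterations = 10000000;

        // Both operations must produce the same result before timings mean anything.
        if (Looper(iterations, Conditional) != Looper(iterations, Normal))
        {
            throw new Exception("results differ");
        }

        Console.WriteLine("Conditional took {0}ms", TimeMs(iterations, Conditional));
        Console.WriteLine("Normal took {0}ms", TimeMs(iterations, Normal));
    }
}
```

Note that calling through a Func delegate adds an invocation per iteration, so the absolute numbers include that overhead for both variants.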

These tests still show a performance differential, but now the other way round. Simplified IL below.

... Conditional ...
{
     : ldarg.1      // push second arg
     : ldarg.0      // push first arg
     : brtrue.s T   // if first arg is true jump to T
     : ldc.i4.1     // push int32(1)
     : br.s F       // jump to F
    T: ldc.i4.2     // push int32(2)
    F: add          // add either 1 or 2 to second arg
     : ret          // return result
}

... Normal ...
{
     : ldarg.0      // push first arg
     : brfalse.s F  // if first arg is false jump to F
     : ldc.i4.2     // push int32(2)
     : ldarg.1      // push second arg
     : add          // add second arg to 2
     : ret          // return result
    F: ldc.i4.1     // push int32(1)
     : ldarg.1      // push second arg
     : add          // add second arg to 1
     : ret          // return result
}
Jodrell
  • When you're interested in speed and optimizations, IL is at most half the story. – H H Sep 17 '12 at 15:08
  • 1
    @HenkHolterman, Indeed, it seems so, I moved to a more specific question to see if anyone has the other half. http://stackoverflow.com/q/12462418/659190 – Jodrell Sep 17 '12 at 15:28
  • @HenkHolterman Agreed - it certainly wouldn't be your first choice. However, when you have 2 trivial chunks of code that you believe should do the same thing behind the scenes, this is often your first port of call. As a case in point, check out the seemingly harmless setting of a integer value type to zero... – Robbie Dee Sep 17 '12 at 15:37
  • Hi did u recheck the speed of the original code other than the modified code which "put the functionality for loop into a function"? Do a little math to fix the stopwatcher 65 148 217 -> 65 83 69 is exactly what I got in my question – colinfang Sep 18 '12 at 08:33
  • @colinfang, I did, I've updated my answer and yes in one way, I have reinforced your question. – Jodrell Sep 18 '12 at 09:29
1

There are 3 codes which do the same thing, however their performance differs

That's not so surprising, is it? Write things a little bit differently and you get different timings.

I guess it is because of Branch Prediction.

That could explain, in part, why the first snippet is faster. But notice that ?: is still branching.
The other thing to note is that it is simply one large expression, the ideal territory for an optimizer.

The problem is that you cannot look at code like this and conclude that a certain operator is faster/slower. The surrounding code matters at least as much.

H H