Is there a cost to entering and exiting a C# checked block?

Question

Consider a loop like this:

for (int i = 0; i < end; ++i)
    // do something

If I know that i won't overflow, but I want a check against overflow, truncation, etc., in the "do something" part, am I better off with the checked block inside or outside the loop?

for (int i = 0; i < end; ++i)
    checked {
         // do something
    }

or

checked {
    for (int i = 0; i < end; ++i)
         // do something
}

More generally, is there a cost to switching between checked and unchecked mode?

I doubt that `checked` and `unchecked` make it to the IL as blocks. They probably simply tell the compiler to emit different code in arithmetic instructions that appear directly in them. But let me ...check. — Theodoros Chatzigiannakis, Sep 17 '15 at 17:11
@TheodorosChatzigiannakis I agree, the `checked` block just emits the `.ovf` instructions. — Eldar Dordzhiev, Sep 17 '15 at 17:13
https://en.wikipedia.org/wiki/List_of_CIL_instructions Heh, @EldarDordzhiev beat me to it - the `.ovf` versions of the arithmetic instructions are emitted - they do add a bit of an overhead versus using the "plain" arithmetic instructions. — xxbbcc, Sep 17 '15 at 17:13
@EldarDordzhiev: But would wrapping the whole of the `for` loop in `checked` cause the `++i` part to use `add.ovf` instead of `add`. So it *might* be slower? — Matt Burland, Sep 17 '15 at 17:18
So short answer, put the `checked` around the part that actually needs to be checked and not around the loop (or it'll get checked when incrementing). — Matt Burland, Sep 17 '15 at 17:22
@MattBurland It might be slower, depending on the `end` expression. If the loop iterates over an array and `i` isn't changed in the loop, then the bounds check and the overflow check are both omitted. — Eldar Dordzhiev, Sep 17 '15 at 18:15
In fact there are even `checked` and `unchecked` _expressions_ (see https://msdn.microsoft.com/en-us/library/aa691349%28v=vs.71%29.aspx), so you don't even need blocks, but the other answers are correct. — Doug McClean, Sep 17 '15 at 23:30

Justin Niessner · Answer 1 · 2015-09-17T17:21:09.547

20

If you really want to see the difference, check out some generated IL. Let's take a very simple example:

using System;

public class Program 
{
    public static void Main()
    {
        for(int i = 0; i < 10; i++)
        {
            var b = int.MaxValue + i;
        }
    }
}

And we get:

.maxstack  2
.locals init (int32 V_0,
         int32 V_1,
         bool V_2)
IL_0000:  nop
IL_0001:  ldc.i4.0
IL_0002:  stloc.0
IL_0003:  br.s       IL_0013

IL_0005:  nop
IL_0006:  ldc.i4     0x7fffffff
IL_000b:  ldloc.0
IL_000c:  add
IL_000d:  stloc.1
IL_000e:  nop
IL_000f:  ldloc.0
IL_0010:  ldc.i4.1
IL_0011:  add
IL_0012:  stloc.0
IL_0013:  ldloc.0
IL_0014:  ldc.i4.s   10
IL_0016:  clt
IL_0018:  stloc.2
IL_0019:  ldloc.2
IL_001a:  brtrue.s   IL_0005

IL_001c:  ret

Now, let's make sure we're checked:

public class Program 
{
    public static void Main()
    {
        for(int i = 0; i < 10; i++)
        {
            checked
            {
                var b = int.MaxValue + i;
            }
        }
    }
}

And now we get the following IL:

.maxstack  2
.locals init (int32 V_0,
         int32 V_1,
         bool V_2)
IL_0000:  nop
IL_0001:  ldc.i4.0
IL_0002:  stloc.0
IL_0003:  br.s       IL_0015

IL_0005:  nop
IL_0006:  nop
IL_0007:  ldc.i4     0x7fffffff
IL_000c:  ldloc.0
IL_000d:  add.ovf
IL_000e:  stloc.1
IL_000f:  nop
IL_0010:  nop
IL_0011:  ldloc.0
IL_0012:  ldc.i4.1
IL_0013:  add
IL_0014:  stloc.0
IL_0015:  ldloc.0
IL_0016:  ldc.i4.s   10
IL_0018:  clt
IL_001a:  stloc.2
IL_001b:  ldloc.2
IL_001c:  brtrue.s   IL_0005

IL_001e:  ret

As you can see, the only difference (with the exception of some extra nops) is that our add operation emits add.ovf rather than a simple add. The only overhead you'll accrue is the difference between those two operations.

Now, what happens if we move the checked block to include the entire for loop:

public class Program 
{
    public static void Main()
    {
        checked
        {
            for(int i = 0; i < 10; i++)
            {
                var b = int.MaxValue + i;
            }
        }
    }
}

We get the new IL:

.maxstack  2
.locals init (int32 V_0,
         int32 V_1,
         bool V_2)
IL_0000:  nop
IL_0001:  nop
IL_0002:  ldc.i4.0
IL_0003:  stloc.0
IL_0004:  br.s       IL_0014

IL_0006:  nop
IL_0007:  ldc.i4     0x7fffffff
IL_000c:  ldloc.0
IL_000d:  add.ovf
IL_000e:  stloc.1
IL_000f:  nop
IL_0010:  ldloc.0
IL_0011:  ldc.i4.1
IL_0012:  add.ovf
IL_0013:  stloc.0
IL_0014:  ldloc.0
IL_0015:  ldc.i4.s   10
IL_0017:  clt
IL_0019:  stloc.2
IL_001a:  ldloc.2
IL_001b:  brtrue.s   IL_0006

IL_001d:  nop
IL_001e:  ret

You can see that both add operations have been converted to add.ovf, not just the inner one, so you're getting twice the "overhead". In any case, I'm guessing the "overhead" would be negligible for most use cases.
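If the goal is to check only the work inside the loop, the checked expression form keeps the scope as tight as possible. The following is a minimal sketch, not taken from the IL above: the class and method names are made up for illustration, and it assumes the project is built with the default (unchecked) setting, so the loop counter increment stays a plain add.

```csharp
using System;

public static class CheckedScopeDemo
{
    // Hypothetical helper: sums the values, applying overflow checking
    // only to the accumulation and not to the loop counter.
    public static int SumChecked(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; ++i)   // outside the checked expression
        {
            total = checked(total + values[i]);   // this addition is overflow-checked
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumChecked(new[] { 1, 2, 3 }));

        try
        {
            SumChecked(new[] { int.MaxValue, 1 });
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```

Only the operations textually inside checked(...) are affected, which matches the add / add.ovf split seen in the second IL listing.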

edited Sep 17 '15 at 17:21

answered Sep 17 '15 at 17:18

Justin Niessner


But if you wrap the *loop* with `checked`, then you'll end up using `add.ovf` instead of `add` for the loop variable. – Matt Burland Sep 17 '15 at 17:20
@MattBurland - I was in the process of adding that case when you commented. See the update. – Justin Niessner Sep 17 '15 at 17:21
Is there any optimization that the generator takes when generating the machine code if a code block has only `.ovf` arithmetic instructions? Or does the CPU make any optimizations when it executes only "checked" machine code? These are broad questions: there are many generators, instruction sets, and CPUs to consider. My guess is that the answer to both questions is always no, but the possibility highlights why there is usually no substitute for reliable measurements. – Edward Brey Sep 17 '15 at 17:31
This MSIL isn't very relevant, because it looks like it is generated by the compiler in debug mode, without optimization. – Ben Voigt Sep 17 '15 at 18:31
@BenVoigt - Would the differences in this case be optimized away (I'd also probably have to come up with more detailed examples)? I'd imagine the operation would still be the same. – Justin Niessner Sep 17 '15 at 18:37
The conclusion is right but the reasoning is wrong. What matters is what the JIT generates as x86. IL is irrelevant. – usr Sep 17 '15 at 18:39
@JustinNiessner: The loop increment might be optimized (since the optimizer can prove there is no overflow). All the `nop` and useless `br.s` instructions will disappear for sure. Extra local variables will disappear. Contrary to the rules for native code optimization, reading optimized MSIL is often easier than debug mode. – Ben Voigt Sep 17 '15 at 18:39
@BenVoigt - It also doesn't help that I don't have access to Visual Studio anymore. Don't see an option to generate Release code in DotNetFiddle. :-P – Justin Niessner Sep 17 '15 at 18:39
Ahh, that does complicate things. So this MSIL might not be generated by any version of the official Microsoft compiler? – Ben Voigt Sep 17 '15 at 18:41
@BenVoigt - You have the option of using the standard .NET 4.5 compiler or Roslyn. The above was using 4.5. – Justin Niessner Sep 17 '15 at 18:42
@usr: The fact that two things generate different IL doesn't mean they'll generate different x86 code, but if two things generate identical IL, any x86 code will likewise be identical. – supercat Sep 17 '15 at 21:09
@supercat still, the JIT might generate some kind of scope and cause a scope based cost. It might be a different scope than is visible at the C# level. I know for a fact that it doesn't, but this answer does not prove that. – usr Sep 19 '15 at 12:13
@usr: If two source files produce bit-for-bit identical IL files, by what means *could* they be JITted to different machine code? How would the JITter know which file was being used? To be sure, in a debug build even insignificant changes to the source file may be detected by the JIT if it examines debug metadata, but if a change to the source file doesn't affect the output of a release build, how would the JITter know about it? – supercat Sep 21 '15 at 15:12
@supercat sure, they will result in the same machine code but that code might still have a cost for entering some kind of "checked scope". "is there a cost to switching between checked and unchecked mode?" has not been completely answered (although good evidence was given). – usr Sep 21 '15 at 15:32

Theodoros Chatzigiannakis · Accepted Answer · 2015-09-17T23:48:45.923

checked and unchecked blocks don't appear at the IL level. They are only used in the C# source code to tell the compiler whether to pick the checking or non-checking IL instructions when overriding the default preference of the build configuration (which is set through a compiler flag).

Of course, typically there will be a performance difference due to the fact that different opcodes have been emitted for the arithmetic operations (but not due to entering or exiting the block). Checked arithmetic is generally expected to have some overhead over corresponding unchecked arithmetic.
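A rough way to see that overhead for yourself is to time the same loop both ways. The sketch below is only illustrative: the class and method names are made up for this example, and a single Stopwatch run is no substitute for a proper benchmarking harness, so treat any numbers it prints as indicative at best.

```csharp
using System;
using System.Diagnostics;

public static class CheckedTimingSketch
{
    // Adds 0..n-1 using plain (wrapping) addition.
    public static long SumUnchecked(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; ++i) sum = unchecked(sum + i);
        return sum;
    }

    // Same loop, but the accumulation is overflow-checked.
    public static long SumChecked(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; ++i) sum = checked(sum + i);
        return sum;
    }

    public static void Main()
    {
        const int n = 100000000;

        // Call each method once first so JIT compilation is not timed.
        SumUnchecked(1000);
        SumChecked(1000);

        var sw = Stopwatch.StartNew();
        long u = SumUnchecked(n);
        sw.Stop();
        Console.WriteLine("unchecked: " + sw.ElapsedMilliseconds + " ms (sum " + u + ")");

        sw.Restart();
        long c = SumChecked(n);
        sw.Stop();
        Console.WriteLine("checked:   " + sw.ElapsedMilliseconds + " ms (sum " + c + ")");
    }
}
```

Build in Release mode and run without a debugger attached, otherwise the comparison says very little about optimized code.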

As a matter of fact, consider this C# program:

using System;

class Program
{
    static void Main(string[] args)
    {
        var a = 1;
        var b = 2;
        int u1, c1, u2, c2;

        Console.Write("unchecked add ");
        unchecked
        {
            u1 = a + b;
        }
        Console.WriteLine(u1);

        Console.Write("checked add ");
        checked
        {
            c1 = a + b;
        }
        Console.WriteLine(c1);

        Console.Write("unchecked call ");
        unchecked
        {
            u2 = Add(a, b);
        }
        Console.WriteLine(u2);

        Console.Write("checked call ");
        checked
        {
            c2 = Add(a, b);
        }
        Console.WriteLine(c2);
    }

    static int Add(int a, int b)
    {
        return a + b;
    }
}

This is the generated IL, with optimizations turned on and with unchecked arithmetic by default:

.class private auto ansi beforefieldinit Checked.Program
    extends [mscorlib]System.Object
{    
    .method private hidebysig static int32 Add (
            int32 a,
            int32 b
        ) cil managed 
    {
        IL_0000: ldarg.0
        IL_0001: ldarg.1
        IL_0002: add
        IL_0003: ret
    }

    .method private hidebysig static void Main (
            string[] args
        ) cil managed 
    {
        .entrypoint
        .locals init (
            [0] int32 b
        )

        IL_0000: ldc.i4.1
        IL_0001: ldc.i4.2
        IL_0002: stloc.0

        IL_0003: ldstr "unchecked add "
        IL_0008: call void [mscorlib]System.Console::Write(string)
        IL_000d: dup
        IL_000e: ldloc.0
        IL_000f: add
        IL_0010: call void [mscorlib]System.Console::WriteLine(int32)

        IL_0015: ldstr "checked add "
        IL_001a: call void [mscorlib]System.Console::Write(string)
        IL_001f: dup
        IL_0020: ldloc.0
        IL_0021: add.ovf
        IL_0022: call void [mscorlib]System.Console::WriteLine(int32)

        IL_0027: ldstr "unchecked call "
        IL_002c: call void [mscorlib]System.Console::Write(string)
        IL_0031: dup
        IL_0032: ldloc.0
        IL_0033: call int32 Checked.Program::Add(int32,  int32)
        IL_0038: call void [mscorlib]System.Console::WriteLine(int32)

        IL_003d: ldstr "checked call "
        IL_0042: call void [mscorlib]System.Console::Write(string)
        IL_0047: ldloc.0
        IL_0048: call int32 Checked.Program::Add(int32,  int32)
        IL_004d: call void [mscorlib]System.Console::WriteLine(int32)

        IL_0052: ret
    }
}

As you can see, the checked and unchecked blocks are merely a source code concept - there is no IL emitted when switching back and forth between what was (in the source) a checked and an unchecked context. What changes is the opcodes emitted for direct arithmetic operations (in this case, add and add.ovf) that were textually enclosed in those blocks. The specification covers which operations are affected:

The following operations are affected by the overflow checking context established by the checked and unchecked operators and statements:

The predefined ++ and -- unary operators (§7.6.9 and §7.7.5), when the operand is of an integral type.

The predefined - unary operator (§7.7.2), when the operand is of an integral type.

The predefined +, -, *, and / binary operators (§7.8), when both operands are of integral types.

Explicit numeric conversions (§6.2.1) from one integral type to another integral type, or from float or double to an integral type.
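To make that list concrete, here is a small sketch that runs one operation from each category in a checked context with a value that doesn't fit. The class, the Overflows helper, and the method names are all hypothetical, chosen just for this illustration.

```csharp
using System;

public static class AffectedOperationsDemo
{
    // Hypothetical helper: reports whether evaluating op throws OverflowException.
    public static bool Overflows(Func<long> op)
    {
        try { op(); return false; }
        catch (OverflowException) { return true; }
    }

    // Predefined ++ on an integral operand.
    public static bool IncrementOverflows(int x) =>
        Overflows(() => { checked { x++; } return x; });

    // Predefined unary minus on an integral operand.
    public static bool NegationOverflows(int x) => Overflows(() => checked(-x));

    // Predefined binary operator with two integral operands.
    public static bool MultiplyOverflows(int x, int y) => Overflows(() => checked(x * y));

    // Explicit conversion from one integral type to another.
    public static bool NarrowingOverflows(long x) => Overflows(() => checked((int)x));

    // Explicit conversion from double to an integral type.
    public static bool FloatToIntOverflows(double x) => Overflows(() => checked((int)x));

    public static void Main()
    {
        Console.WriteLine(IncrementOverflows(int.MaxValue));
        Console.WriteLine(NegationOverflows(int.MinValue));
        Console.WriteLine(MultiplyOverflows(int.MaxValue, 2));
        Console.WriteLine(NarrowingOverflows(long.MaxValue));
        Console.WriteLine(FloatToIntOverflows(1e20));
    }
}
```

Each call in Main should report an overflow. With small inputs such as MultiplyOverflows(3, 4), the same methods return false.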

And as you can see, a method called from a checked or unchecked block will retain its body and it will not receive any information about what context it was called from. This is also spelled out in the specification:

The checked and unchecked operators only affect the overflow checking context for those operations that are textually contained within the “(” and “)” tokens. The operators have no effect on function members that are invoked as a result of evaluating the contained expression.

In the example
class Test
{
  static int Multiply(int x, int y) {
      return x * y;
  }
  static int F() {
      return checked(Multiply(1000000, 1000000));
  }
}
the use of checked in F does not affect the evaluation of x * y in Multiply, so x * y is evaluated in the default overflow checking context.
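The spec example can be turned into something runnable along these lines. This is a sketch with made-up names (CallBoundaryDemo, ViaCall, Inline). Multiply is marked unchecked explicitly here, as a stand-in for the default context, so that the behavior doesn't depend on how the project is compiled.

```csharp
using System;

public static class CallBoundaryDemo
{
    // Stand-in for a method compiled in an unchecked default context:
    // the product silently wraps around on overflow.
    static int Multiply(int x, int y)
    {
        return unchecked(x * y);
    }

    // checked(...) around the call has no effect on the arithmetic inside Multiply.
    public static int ViaCall(int x, int y)
    {
        return checked(Multiply(x, y));
    }

    // Here the multiplication itself is textually inside checked(...).
    public static int Inline(int x, int y)
    {
        return checked(x * y);
    }

    public static void Main()
    {
        Console.WriteLine(ViaCall(1000000, 1000000));   // wrapped value, no exception

        try
        {
            Console.WriteLine(Inline(1000000, 1000000));
        }
        catch (OverflowException)
        {
            Console.WriteLine("Inline threw OverflowException");
        }
    }
}
```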

As noted, the above IL was generated with C# compiler optimizations turned on. The same conclusions can be drawn from the IL that's emitted without these optimizations.

Turn on optimization; debug MSIL isn't interesting from a performance perspective. — Ben Voigt, Sep 17 '15 at 18:32

Eldar Dordzhiev · Answer 3 · 2015-09-17T18:29:32.233

In addition to the answers above, I want to clarify how the check is performed. The only method I know of is to check the OF and CF flags. The CF flag signals overflow in unsigned arithmetic, whereas the OF flag signals overflow in signed arithmetic.

These flags can be read with the seto\setc instructions or (the most used way) we can just use the jo\jc jump instruction which will jump to the desired address if the OF\CF flag is set.

But there's a problem. jo\jc is a "conditional" jump, which is a total pain in the *** for the CPU pipeline. So I thought maybe there's another way to do that, like setting a special register to interrupt the execution when overflow is detected, so I decided to find out how Microsoft's JIT does that.

I'm sure most of you have heard that Microsoft has open-sourced a subset of .NET named .NET Core. The source code of .NET Core includes CoreCLR, so I dug into it. The overflow detection code is generated in the CodeGen::genCheckOverflow(GenTreePtr tree) method (line 2484). It can be clearly seen that the jo instruction is used for the signed overflow check and jb (surprise!) for the unsigned one. I haven't programmed in assembly for a long time, but it looks like jb and jc are the same instruction (they both check only the carry flag). I don't know why the JIT developers decided to use jb instead of jc, because if I were a CPU maker, I would make the branch predictor assume that jo\jc jumps are very unlikely to be taken.

To sum up, there are no additional instructions invoked to switch between checked and unchecked mode, but the arithmetic operations in a checked block must be noticeably slower, since the check is performed after every arithmetic instruction. However, I'm pretty sure that modern CPUs can handle this well.
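For readers who don't think in flags, here is what the two conditions amount to for a 32-bit addition, written as explicit tests in C#. This is only a sketch of the semantics with made-up names. It is not what the JIT emits, which simply branches on the flag after the add.

```csharp
using System;

public static class FlagSemanticsDemo
{
    // Signed overflow (the condition OF reports for an add): both operands
    // have the same sign and the wrapped result has the opposite sign.
    public static bool SignedAddOverflows(int a, int b)
    {
        int r = unchecked(a + b);
        return ((a ^ r) & (b ^ r)) < 0;
    }

    // Unsigned overflow (the condition CF reports for an add): the wrapped
    // result is smaller than one of the operands.
    public static bool UnsignedAddOverflows(uint a, uint b)
    {
        uint r = unchecked(a + b);
        return r < a;
    }

    public static void Main()
    {
        Console.WriteLine(SignedAddOverflows(int.MaxValue, 1));
        Console.WriteLine(SignedAddOverflows(1, -1));
        Console.WriteLine(UnsignedAddOverflows(uint.MaxValue, 1u));
        Console.WriteLine(UnsignedAddOverflows(1u, 2u));
    }
}
```

A checked addition throws OverflowException in exactly the cases where the matching test above returns true.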

I hope it helps.

You're talking specifically about the x86/x64 architecture, right? I think you should make that explicit, .Net runs on other architectures too. — svick, Sep 17 '15 at 20:57
The "jb" and "jc" mnemonics assemble to the same pattern of bits. I believe Intel refers to the instruction that jumps when carry is set as "jb" [destination is below source] and the one that jumps if carry is not set as "jae" [destination above or equal to source] for consistency with "ja" [destination is above source, i.e. carry flag is clear and Z flag is not set] and "jbe" [destination below or equal to source, i.e. carry is set or Z flag is set]. — supercat, Sep 17 '15 at 21:07

Kapoor · Answer 4 · 2015-09-17T17:40:05.543

1

"More generally, is there a cost to switching between checked and unchecked mode?"

No, not in your example. The only overhead is the ++i.

In both cases the C# compiler will generate add.ovf, sub.ovf, mul.ovf or conv.ovf.

But when the loop is within the checked block, there will be an additional add.ovf for ++i.

edited Sep 17 '15 at 17:40

answered Sep 17 '15 at 17:33

Kapoor
