17

I've been digging into IL recently, and I noticed some odd behavior of the C# compiler. The following method is a very simple, verifiable application; it immediately exits with exit code 1:

static int Main(string[] args)
{
    return 1;
}

When I compile this with Visual Studio Community 2015, the following IL code is generated (comments added):

.method private hidebysig static int32 Main(string[] args) cil managed
{
  .entrypoint
  .maxstack  1
  .locals init ([0] int32 V_0)     // Local variable init
  IL_0000:  nop                    // Do nothing
  IL_0001:  ldc.i4.1               // Push '1' to stack
  IL_0002:  stloc.0                // Pop stack to local variable 0
  IL_0003:  br.s       IL_0005     // Jump to next instruction
  IL_0005:  ldloc.0                // Load local variable 0 onto stack
  IL_0006:  ret                    // Return
}

If I were to handwrite this method, seemingly the same result could be achieved with the following IL:

.method static int32 Main()
{
  .entrypoint
  ldc.i4.1               // Push '1' to stack
  ret                    // Return
}

Are there underlying reasons that I'm not aware of that make this the expected behaviour?

Or is it just that the assembled IL is further optimized down the line, so the C# compiler does not have to worry about optimization?

lpmitchell

3 Answers

24

The output you've shown is for a debug build. With a release build (or basically with optimizations turned on) the C# compiler generates the same IL you'd have written by hand.

I strongly suspect that this is all to make the debugger's work easier: it makes it simpler to set a breakpoint and to see the return value before it's returned.

Moral: when you want to run optimized code, make sure you're not asking the compiler to generate code that's aimed at debugging :)
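
One way to check which kind of build you are actually running is to look at the `DebuggableAttribute` on the assembly. This is a minimal sketch, assuming the compiler emitted that attribute; the `BuildCheck` class and its method name are made up for illustration, and the attribute describes JIT settings rather than the exact compiler switches that were used:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class BuildCheck
{
    // True when the assembly is marked as having the JIT optimizer disabled,
    // which is how debug-oriented builds are typically marked.
    // (Hypothetical helper, not part of any library.)
    public static bool LooksLikeDebugBuild(Assembly assembly)
    {
        DebuggableAttribute attribute =
            assembly.GetCustomAttribute<DebuggableAttribute>();
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    static void Main()
    {
        Assembly self = typeof(BuildCheck).Assembly;
        Console.WriteLine(LooksLikeDebugBuild(self)
            ? "Marked for debugging (optimizations off)"
            : "No debug marker found (optimizations likely on)");
    }
}
```

If this reports a debug marker, the IL you are inspecting is probably the verbose form shown in the question.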

Jon Skeet
  • This makes a lot of sense, thanks! When compiling in release mode, the IL is exactly as expected – lpmitchell Jan 25 '18 at 15:06
  • Jon's suspicions are of course correct. @lpmitchell: Something you'll see a lot in unoptimized code is that values which could be "ephemeral" -- that is, just pushed onto the evaluation stack and then popped when we're done with them -- are instead stored and read from specific local variable stack slots. This does nothing for the number 1, but imagine if instead there was an object reference; storing it and retrieving it makes it far less likely that the GC will collect the object earlier than you expect, which helps in debugging. – Eric Lippert Jan 25 '18 at 16:08
  • @EricLippert the local makes perfect sense, but is there any rationale for that `br.s` instruction, or is it just there out of convenience in the emitter code? I guess that if the compiler wanted to insert a breakpoint placeholder there, it could just emit a `nop`... – Lucas Trzesniewski Jan 25 '18 at 19:39
  • @LucasTrzesniewski: I've posted an answer following up on your question. – Eric Lippert Jan 25 '18 at 19:54
11

Jon's answer is of course correct; this answer is to follow up on this comment:

@EricLippert the local makes perfect sense, but is there any rationale for that `br.s` instruction, or is it just there out of convenience in the emitter code? I guess that if the compiler wanted to insert a breakpoint placeholder there, it could just emit a `nop`...

The reason for the seemingly senseless branch becomes more sensible if you look at a more complicated program fragment:

public int M(bool b) {
    if (b) 
      return 1; 
    else 
      return 2;
}

The unoptimized IL is

    IL_0000: nop
    IL_0001: ldarg.1
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: brfalse.s IL_000a
    IL_0006: ldc.i4.1
    IL_0007: stloc.1
    IL_0008: br.s IL_000e
    IL_000a: ldc.i4.2
    IL_000b: stloc.1
    IL_000c: br.s IL_000e
    IL_000e: ldloc.1
    IL_000f: ret

Notice that there are two return statements but only one ret instruction. In unoptimized IL, the pattern for codegen'ing a simple return statement is:

  • stuff the value you're going to return into a stack slot
  • branch/leave to the end of the method
  • at the end of the method, read the value out of the slot and return

That is, the unoptimized code uses single-point-of-return form.

In both this case and the simple case shown by the original poster, that pattern causes a "branch to next" situation to be generated. The "remove any branch to next" optimizer does not run when generating unoptimized code, so it remains.
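
As a rough source-level analogy for the pattern described above, the following hand-written C# mirrors the shape of the unoptimized IL: each `return` becomes a store into a local followed by a jump to one shared exit. The class name `ReturnShapes`, the method `MSinglePointOfReturn`, and the `done` label are made up for illustration; this is only a sketch of the shape, not a description of the compiler's internals.

```csharp
using System;

static class ReturnShapes
{
    // The method as written in the fragment above (made static for brevity).
    public static int M(bool b)
    {
        if (b)
            return 1;
        else
            return 2;
    }

    // Hand-written analogy of the single-point-of-return shape (hypothetical).
    public static int MSinglePointOfReturn(bool b)
    {
        int result;          // plays the role of the local slot
        if (b)
        {
            result = 1;      // like ldc.i4.1 / stloc.1
            goto done;       // like br.s to the shared exit
        }
        else
        {
            result = 2;      // like ldc.i4.2 / stloc.1
            goto done;       // like br.s to the shared exit: a "branch to next"
        }
    done:
        return result;       // like ldloc.1 / ret
    }

    static void Main()
    {
        // Both forms should agree for every input.
        Console.WriteLine(M(true) == MSinglePointOfReturn(true));
        Console.WriteLine(M(false) == MSinglePointOfReturn(false));
    }
}
```

The second `goto done;` is immediately followed by its own target, which is the same redundancy as the `br.s IL_000e` at `IL_000c` in the listing.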

Eric Lippert
-4

What I'm about to write isn't really .NET-specific but general, and I don't know which optimizations .NET recognizes and uses when generating CIL. The syntax tree (and therefore the parser itself) recognizes a return statement with the following grammar rule:

returnStatement ::= RETURN expr ;

where returnStatement and expr are non-terminals and RETURN is a terminal (the return token). So when visiting the node for the constant 1, the parser treats it as part of an expression. To further illustrate what I mean, the code for:

return 1 + 1;

would look something like this for a (virtual) machine using an expression stack:

push const_1 // Pushes numerical value '1' to expression stack
push const_1 // Pushes numerical value '1' to expression stack
add          // result = pop() + pop(); push(result)
return       // pops the value on the top of the stack and returns it as the function result
exit         
nstosic
  • You're forgetting about a very common optimization called [constant folding](https://en.wikipedia.org/wiki/Constant_folding). When the compiler sees 1+1, it knows that's always going to be 2, no matter what. So rather than having the program add 1 and 1 at runtime, it does the addition once, during compilation. So in your pseudocode, the two `push const_1` lines would be replaced by a single `push const_2` line. – flarn2006 Nov 03 '19 at 07:23
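
For C# specifically, `1 + 1` is a constant expression, which the compiler evaluates at compile time; that is why it can appear where only constants are allowed, such as a `const` declaration or a `case` label. A minimal sketch (the `FoldingSketch` class and its members are made up for illustration):

```csharp
using System;

static class FoldingSketch
{
    // Only compile-time constants are allowed here, so this compiles
    // only because the compiler itself evaluates 1 + 1.
    const int Two = 1 + 1;

    public static string Describe(int value)
    {
        switch (value)
        {
            case 1 + 1:          // case labels must be constants too
                return "two";
            default:
                return "other";
        }
    }

    public static int ReturnFolded()
    {
        return 1 + 1;            // the same constant expression in a return
    }

    static void Main()
    {
        Console.WriteLine(Two);
        Console.WriteLine(Describe(ReturnFolded()));
    }
}
```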