2

I'm missing serious optimizations because the JIT won't inline a lot of my methods.

For example, take the following code:

static void Main(string[] args)
{
    IsControl('\0');
}

public static bool IsControl(char c)
{
    return ((c >= 0 && c <= 31) || (c >= 127 && c <= 159));
}

It produces the following after JIT compilation:

0000001f  xor         ecx,ecx 
00000021  call        FFFFFFFFFFEC9760 
00000026  mov         byte ptr [rsp+20h],al 
0000002a  nop 
0000002b  jmp         000000000000002D 
0000002d  add         rsp,38h 
00000031  rep ret 

Note that 0000001f is where I set the breakpoint. As you can see, there is a call at 00000021, which should not be there. Why would such a tiny method not qualify for inlining? For the record, this was compiled with optimization on.

user229044
Leopold Asperger
  • Are you checking this with a release build? In Debug, the compiler preserves all functions so that the application can be debugged. – MicroVirus May 24 '14 at 22:02
  • Just clearing the obvious :) Does the same thing happen when IsControl is private? – MicroVirus May 24 '14 at 22:05
  • You might be able to trick the JIT by converting the short-circuit or (`||`) to binary or (`|`), and likewise for and. This might reduce the IL code size and get it below the 32-byte inlining limit. Besides that, there is little hope. Inline manually. Or maybe you can reduce the IL size even more by writing `(c & ~0x1F) == 0` as a replacement for the first part of the condition (the parentheses are needed because `==` binds tighter than `&` in C#). – usr May 24 '14 at 22:11
  • @usr even if I replace the expression with true, it still won't inline the method. I knew C# was not designed for this stuff, but making inlining predictable is the least they could do. – Leopold Asperger May 24 '14 at 22:19
  • If the JIT does not inline `return true;` then something is wrong. You are not starting the program with the debugger attached, right? Does it show "optimized=yes" in the loaded module list? – usr May 24 '14 at 22:23
  • @usr no it doesn't, and that could be very interesting right? – Leopold Asperger May 24 '14 at 22:35
  • This is almost certainly a problem with how you run the code and not with the JIT itself. It's rather annoying that C# doesn't really have any command-line flags to dump the assembled code and requires the VS tricks instead, but alas, what can we do. Read [this](http://blogs.msdn.com/b/clrcodegeneration/archive/2007/10/19/how-to-see-the-assembly-code-generated-by-the-jit-using-visual-studio.aspx) and take particular care with step #3, which I'm pretty sure is what you're missing. – Voo May 25 '14 at 01:25
  • Finally I'm able to read the optimized JIT code now. Thanks @Voo – Leopold Asperger May 25 '14 at 07:28

5 Answers

8

There is no way to require that the JIT compiler inline your methods, aside from using an ahead-of-time source or bytecode transformation to inline the instructions before they ever reach the JIT.

If your algorithm is so sensitive to micro-optimizations that removing call instructions results in a substantial performance advantage, then you might consider rewriting the performance-critical sections of code in a different language that provides more extensive facilities for controlling that behavior. Based on the wording of your question, it appears that you are trying to force C# into a problem space which it was designed to avoid altogether.

Sam Harwell
  • I guess this is what I have to deal with then. – Leopold Asperger May 24 '14 at 22:20
  • "removing call instructions results in a substantial performance advantage" Inlining is much more than "removing call instructions". An inlined method when compiled in the context of the calling method usually gets much simpler. – kaalus Jan 08 '20 at 23:32
7

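Use MethodImplAttribute:

using System.Runtime.CompilerServices; // namespace of MethodImplAttribute

// AggressiveInlining is available from .NET 4.5 onwards.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsControl(char c)
{
    return ((c >= 0 && c <= 31) || (c >= 127 && c <= 159));
}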

See

http://msdn.microsoft.com/en-us/library/system.runtime.compilerservices.methodimplattribute.aspx

and

http://blogs.microsoft.co.il/sasha/2012/01/20/aggressive-inlining-in-the-clr-45-jit/

Rico Suter
4

.NET's jitter has built-in heuristics that help it decide whether To Inline or not to Inline. I could not find a hard reason (see below) that prevents inlining, and on 4.5 I could persuade the jitter to inline via AggressiveInlining, so it can inline the method if it wants to. The heuristics are therefore the likely explanation. A quote:

  1. If inlining makes code smaller than the call it replaces, it is ALWAYS good. Note that we are talking about the NATIVE code size, not the IL code size (which can be quite different).

  2. The more a particular call site is executed, the more it will benefit from inlining. Thus code in loops deserves to be inlined more than code that is not in loops.

  3. If inlining exposes important optimizations, then inlining is more desirable. In particular methods with value types arguments benefit more than normal because of optimizations like this and thus having a bias to inline these methods is good.

Thus the heuristic the X86 JIT compiler uses is, given an inline candidate:

  1. Estimate the size of the call site if the method were not inlined.

  2. Estimate the size of the call site if it were inlined (this is an estimate based on the IL, we employ a simple state machine (Markov Model), created using lots of real data to form this estimator logic)

  3. Compute a multiplier. By default it is 1

  4. Increase the multiplier if the code is in a loop (the current heuristic bumps it to 5 in a loop)

  5. Increase the multiplier if it looks like struct optimizations will kick in.

  6. If InlineSize <= NonInlineSize * Multiplier do the inlining.
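The size comparison in the quoted list can be sketched roughly as follows. This only illustrates steps 3 to 6 and is not the CLR's actual code: `ShouldInline`, its parameters and the 1.5 factor for struct optimizations are made up for the example (the quote does not say by how much the multiplier grows in that case), and the two size estimates are simply passed in.

```csharp
using System;

static class InlineHeuristicSketch
{
    // Hypothetical helper that mirrors steps 3-6 of the quoted heuristic.
    public static bool ShouldInline(int inlineSize, int nonInlineSize,
                                    bool inLoop, bool structOptimizationsLikely)
    {
        double multiplier = 1.0;                 // step 3: default multiplier is 1

        if (inLoop)
            multiplier = 5.0;                    // step 4: the quote says it is bumped to 5 in a loop

        if (structOptimizationsLikely)
            multiplier *= 1.5;                   // step 5: ASSUMED factor, not stated in the quote

        return inlineSize <= nonInlineSize * multiplier; // step 6
    }

    static void Main()
    {
        // The same candidate is rejected outside a loop and accepted inside one.
        Console.WriteLine(ShouldInline(40, 10, false, false)); // False
        Console.WriteLine(ShouldInline(40, 10, true, false));  // True
    }
}
```

The point of the sketch is that the decision depends on estimated native code sizes and on the call site, which is why the same method can be inlined in one place and not in another.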


What follows is a description of my attempts to get to the bottom of this; it might help others in a similar situation.

I can reproduce it here on .NET 4.5 (both x86 and x64), but I have no idea why it does not get inlined, because it has none of the inlining show stoppers like being a virtual method or having more than 32 bytes of IL. It is only 30 bytes long:

.method public hidebysig static bool  IsControl(char c) cil managed
{
  // code size       30 (0x1e)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldc.i4.0
  IL_0002:  blt.s      IL_0009
  IL_0004:  ldarg.0
  IL_0005:  ldc.i4.s   31
  IL_0007:  ble.s      IL_001c
  IL_0009:  ldarg.0
  IL_000a:  ldc.i4.s   127
  IL_000c:  blt.s      IL_001a
  IL_000e:  ldarg.0
  IL_000f:  ldc.i4     0x9f
  IL_0014:  cgt
  IL_0016:  ldc.i4.0
  IL_0017:  ceq
  IL_0019:  ret
  IL_001a:  ldc.i4.0
  IL_001b:  ret
  IL_001c:  ldc.i4.1
  IL_001d:  ret
} // end of method Program::IsControl

When enabling AggressiveInlining (which you say you cannot, as you are on .Net 3.5), not only does the call get inlined, but the inlined code gets elided completely - as it should, because you don't use the return value:

--- Program.cs --------------------------------------------
        IsControl('\0');
00000000  ret 

N.B. I'm not sure if you are aware that in addition to using the Release build mode, you have to

  • Go to Tools => Options => Debugging => General and make sure that the box labeled ‘Suppress JIT optimization on module load’ is unchecked.
  • Make sure that the box labeled ‘Enable Just My Code’ is Unchecked.

in order to see JIT-optimized code. If you don't, you will get the following instead of the single ret instruction shown above:

--- Program.cs --------------------------------------------
        IsControl('\0');
00000000  push        rbp 
00000001  sub         rsp,30h 
00000005  lea         rbp,[rsp+30h] 
0000000a  mov         qword ptr [rbp+10h],rcx 
0000000e  mov         rax,7FF7F43335E0h 
00000018  cmp         dword ptr [rax],0 
0000001b  je          0000000000000022 
0000001d  call        000000005FAB06C4 
00000022  xor         ecx,ecx 
00000024  call        FFFFFFFFFFFFD3D0 
00000029  and         eax,0FFh 
0000002e  mov         dword ptr [rbp-4],eax 
00000031  nop 
    }
00000032  nop 
00000033  lea         rsp,[rbp] 
00000037  pop         rbp 
00000038  ret 

Incidentally, the following shorter (and not equivalent) method does get inlined even without AggressiveInlining:

public static bool IsControl(char c)
{
    return c <= 31 || c >= 127;
}
Evgeniy Berezovsky
  • You can run it under COMPLUS and see why the compiler decided not to inline it. See https://github.com/dotnet/coreclr/blob/master/Documentation/building/viewing-jit-dumps.md – Alex Zhukovskiy Nov 08 '17 at 16:54
3

As you are into micro-optimizations, your IsControl method should look like one of the following, depending on the (expected) actual distribution of the c values:

public static bool IsControl2(char c)
{
    return c <= 31 || (c >= 127 && c <= 159);
}
public static bool IsControl3(char c)
{
    return c <= 159 && (c <= 31 || c >= 127);
}

Both variants remove the superfluous check for c >= 0 (the minimum value of char is 0), reducing the number of comparisons to 3 in the worst case (though I haven't checked whether the jitter is smart enough to elide the redundant check by itself). They also reduce the IL code size of the method from 30 to 26 bytes, which might influence the jitter's decision whether to inline or not.
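To confirm that dropping the c >= 0 check does not change behaviour, a brute-force comparison over every possible char value is cheap. The sketch below is only a sanity check; the class name `IsControlCheck` and the helper `CountMismatches` are invented for the example.

```csharp
using System;

static class IsControlCheck
{
    // Original method from the question.
    static bool IsControl(char c)  { return (c >= 0 && c <= 31) || (c >= 127 && c <= 159); }

    // The two rewritten variants from this answer.
    static bool IsControl2(char c) { return c <= 31 || (c >= 127 && c <= 159); }
    static bool IsControl3(char c) { return c <= 159 && (c <= 31 || c >= 127); }

    // Counts the char values for which either variant disagrees with the original.
    public static int CountMismatches()
    {
        int mismatches = 0;
        // Loop with an int counter so the loop terminates after char.MaxValue.
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            bool expected = IsControl(c);
            if (IsControl2(c) != expected || IsControl3(c) != expected)
                mismatches++;
        }
        return mismatches;
    }

    static void Main()
    {
        Console.WriteLine(CountMismatches()); // prints 0
    }
}
```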

Evgeniy Berezovsky
2

There is a checkbox in the Visual Studio 2017 (and earlier) Debugger options called "Suppress JIT optimization on module load (Managed Only)". If this option is checked, you will not get any method inlining when running in the debugger, regardless of Release or Debug build or the [MethodImpl(MethodImplOptions.AggressiveInlining)] attribute.
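As far as I know the checkbox itself cannot be queried from code, but two related conditions can be checked at runtime: whether a debugger is attached, and whether the assembly was compiled with the JIT optimizer disabled (a Debug build). The following is a minimal sketch under that assumption; the class name `JitEnvironmentCheck` and the helper `IsJitOptimizerDisabled` are invented for the example, while `Debugger.IsAttached` and `DebuggableAttribute.IsJITOptimizerDisabled` are existing framework members.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class JitEnvironmentCheck
{
    // True if the assembly carries a DebuggableAttribute that disables the JIT optimizer,
    // which is what a Debug build normally emits.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        var attribute = (DebuggableAttribute)Attribute.GetCustomAttribute(
            assembly, typeof(DebuggableAttribute));
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    static void Main()
    {
        // With a debugger attached, the Visual Studio option described above may apply.
        Console.WriteLine("Debugger attached: " + Debugger.IsAttached);
        Console.WriteLine("JIT optimizer disabled at build time: "
            + IsJitOptimizerDisabled(Assembly.GetExecutingAssembly()));
    }
}
```

If both lines print False, neither of these two conditions explains a missing inline. This does not prove that the method was inlined, only that the usual debugging-related obstacles are absent.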



Glenn Slayden