Why does localloc break this CIL method?

Question

I have the following piece of reduced CIL code.
When this CIL method is executed, an InvalidProgramException is being thrown by the CLR:

  .method assembly hidebysig specialname rtspecialname 
          instance void  .ctor(class [mscorlib]System.Collections.Generic.IEnumerable`1<class System.Windows.Input.StylusDeviceBase> styluses) cil managed
  {
    .locals init (class [mscorlib]System.Collections.Generic.IEnumerator`1<class System.Windows.Input.StylusDeviceBase> V_0,
         class System.Windows.Input.StylusDeviceBase V_1)

    ldc.i4.8   // These instructions cause CIL to break 
    conv.u     //
    localloc   //
    pop        //

    ldarg.0
    newobj instance void class [mscorlib]System.Collections.Generic.List`1<class System.Windows.Input.StylusDevice>::.ctor()
    call   instance void class [mscorlib]System.Collections.ObjectModel.ReadOnlyCollection`1<class System.Windows.Input.StylusDevice>::.ctor(class [mscorlib]System.Collections.Generic.IList`1<!0>)
    ldarg.1
    callvirt instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<class System.Windows.Input.StylusDeviceBase>::GetEnumerator()
    stloc.0

    .try
    {
       leave.s IL_0040
    }
    finally
    {
       endfinally
    }   

    IL_0040: ret
  } // end of method StylusDeviceCollection::.ctor

My question is, why is this CIL code invalid?

Several obervations:
- If localloc is removed, the code runs fine. To my knowledge, localloc replaces the parameter size on the stack with an address, so the stack remains balanced, AFAICT.
- If the try and finally blocks are removed, the code runs fine.
- If the first block of instructions containing localloc is moved to after the try-finally block, the code runs fine.

So it seems like something in the combination of localloc and the try-finally.

Some background:

I got to this point after the InvalidProgramException was thrown for the original method, due to some instrumentation made in runtime. My approach for debugging this, for figuring out what is wrong with the instrumentation, is:

Disassembling the faulty DLL with ildasm
Applying the instrumentation code to the crashing method
Recreating the DLL from the modified IL with ilasm
Running the program again, and verifying it crashes
Keep reducing the IL code of the crashing method gradualy, down to the minimal scenario that causes the problem (and trying not to introduce bugs along the way...)

Unfortunately, peverify.exe /IL did not indicate any error. I tried to console the ECMA spec and Serge Lidin's Expert .NET IL book, but couldn't figure out what is it that goes wrong.

Is there something basic I am missing?

Edit:

I slightly updated the IL code in question, to make it more complete (without modifying instructions). The second block of instructions, including ldarg, newobj, etc., is taken as is from the working code - the original method code.

What's weird to me is, by removing either localloc or .try-finally, the code works - but none of these, to my knowledge, should change the balancing of the stack, compared to if they're present in the code.

Here's the IL code decompiled into C# with ILSpy:

internal unsafe StylusDeviceCollection(IEnumerable<StylusDeviceBase> styluses)
{
    IntPtr arg_04_0 = stackalloc byte[(UIntPtr)8];
    base..ctor(new List<StylusDevice>());
    IEnumerator<StylusDeviceBase> enumerator = styluses.GetEnumerator();
    try
    {
    }
    finally
    {
    }
}

Edit 2:

More observations:
- Taking the localloc block of IL code, and moving it to the end of the function, code runs fine - so it seems that code on its own is OK.
- The issue does not reproduce when pasting similar IL code into a hello world test function.

I'm very puzzled...

I wish there was a way to get more information from the InvalidProgramException. It seems that the CLR doesn't attach the exact failure reason to the exception object. I also thought on debugging with CoreCLR debug build, but unforunately the program I'm debugging is not compatible with it...

When you open your compiled DLL in ILSpy, does it show the code you expect to see? (Probably only in IL view - this probably won't decompile into C#, although I may be wrong.) — xxbbcc, Oct 25 '17 at 19:26
Sadly, I cannot reproduce this. Is `ldarg.0` to `call` necessary? — IS4, Oct 25 '17 at 19:42
Pretty unclear how peverify.exe could be used, it doesn't like localloc. That ldarg.0 imbalances the stack, you are calling the default constructor of List<>. Just delete it. — Hans Passant, Oct 25 '17 at 19:57
@xxbbcc interesting, I didn't think of decompiling back that IL into C# - nice! It actually decompiles, I'll edit the question with the decompiled code. — valiano, Oct 26 '17 at 05:39
@IllidanS4 Yes, it seems `ldarg.0` is necessary - that's taken directly from the original code. For completeness, I added also the following function call, and the 2nd local in `.locals`, to make the function code identical to the that of the original function. — valiano, Oct 26 '17 at 05:43
@HansPassant point taken about `peverify.exe`, thanks! I believe `ldarg.0` is nessesary, as it taken from the original and working code - I slightly updated the code in question (added a few more instructions and another local from the original function), hopefully to give a slightly clearer picture. If helpful, I could post the C#/IL code of involved functions. — valiano, Oct 26 '17 at 05:54
@valiano Is `localloc` injected (by re-writing the IL stream) or is it added manually during an ildasm->ilasm cycle? — xxbbcc, Oct 26 '17 at 05:56
@xxbbcc usually, that `localloc` is added by IL rewriting in runtime, along with a bunch of other instructions. For ease of debugging, I'm adding it manually to `ildasm` output and running again through `ilasm`. — valiano, Oct 26 '17 at 06:00
@IllidanS4 @HansPassant `peverify` indicates that `ldarg.0` is for loading the `this` pointer which is consumed by the try block (if I'm getting this right) — valiano, Oct 26 '17 at 11:37

valiano · Accepted Answer · 2018-06-01T05:24:30.593

Sadly, it seems I hit a CLR bug...

Everything is working when using the legacy JIT Compiler:

set COMPLUS_useLegacyJit=1

I wasn't able to isolate a specific RyuJit setting which may be causing this. I followed the recommendation in this article:
https://github.com/Microsoft/dotnet/blob/master/Documentation/testing-with-ryujit.md

Thanks to everyone who helped!

Aftermath:

Sometime after I came across the legacy JIT workaround, I realized that the issue only manifests when instrumenting localloc (which is a non verifiable opcode) into a security critical method called from a security transparent method. Only in this scenario, RyuJit would throw an InvalidProgramException, while Legacy JIT won't.

In my reproduction, I disassembled and reassembled the DLL in question and modified the function code directly, keeping security attributes intact - specifically the AllowPartiallyTrustedCallers assembly attribute - which explains why the issue wasn't reproduced with an isolated example.

It might be that in RyuJIT there's some security hardening compared to Legacy JIT which surfaces this issue, but still, the fact that localloc will cause the CLR to throw an InvalidProgramException depdending in the presence of try-catch and its relative location to the localloc, does seem like a subtle bug.

Running SecAnnotate.exe (.NET Security Annotator tool) on the failing DLL was helpful in revealing the security issues between function calls.

More on Security-Transparent Code:
https://learn.microsoft.com/en-us/dotnet/framework/misc/security-transparent-code

You may want to open an issue for the CoreCLR repository (which contains RyuJIT); there are [several](https://github.com/dotnet/coreclr/search?type=Issues) for `localloc` already. `localloc` isn't allowed in a `finally`, so it's possible the jitter is getting confused and disallows it entirely here, even though it's not in the block. — Jeroen Mostert, Oct 26 '17 at 12:06
@JeroenMostert thanks! I suspected this may be the case... I'm running on .NET 4.6, so will check whether this reproduces on the latest CoreCLR. I noticed a similar bug that was already fixed, so chances are, this may be already fixed as well: https://github.com/dotnet/coreclr/blob/master/tests/src/JIT/Regression/VS-ia64-JIT/V1.2-M01/b10841/repro_good.il — valiano, Oct 26 '17 at 12:44
I couldn't repro this and couldn't quite follow all the description about exactly what was required to hit the issue you saw. If you put together a repro case I will take a more in-depth look. — Andy Ayers, Jun 07 '18 at 02:57
@AndyAyers Thanks! It's been a while but I'll try to put together a repro case — valiano, Jun 07 '18 at 03:27

Why does localloc break this CIL method?

1 Answers1