4

I have a compiler that builds and runs correctly, but PEVerify is calling it unverifiable at a certain point. After looking carefully at the error, the corresponding source code, and the ILDasm output for the point in question, I can't find the problem, to the point where I would suspect a bug in PEVerify, except that the .NET and Mono versions report the same error in the same place.

The problematic method reads as follows:

internal static bool InAsyncMethod(Expression value)
{
    INodeWithBody ancestor = value.GetAncestor<BlockExpression>() ?? (INodeWithBody) value.GetAncestor<Method>();
    return ContextAnnotations.IsAsync(ancestor);
}

The error is reported as:

[IL]: Error: [D:\SDL-1.3.0-4423\boo\build\Boo.Lang.Compiler.dll : Boo.Lang.Compiler.TypeSystem.AsyncHelper::InAsyncMethod][offset 0x00000011][found ref 'Boo.Lang.Compiler.Ast.Node'][expected ref Boo.Lang.Compiler.Ast.INodeWithBody'] Unexpected type on the stack.

Offsest 0x11 corresponds to the second half of the ?? expression. From ILDasm:

.method assembly hidebysig static bool  InAsyncMethod(class Boo.Lang.Compiler.Ast.Expression 'value') cil managed
{
  // Code size       29 (0x1d)
  .maxstack  2
  .locals init ([0] class Boo.Lang.Compiler.Ast.INodeWithBody ancestor,
           [1] bool CS$1$0000)
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  callvirt   instance !!0 Boo.Lang.Compiler.Ast.Node::GetAncestor<class Boo.Lang.Compiler.Ast.BlockExpression>()
  IL_0007:  dup
  IL_0008:  brtrue.s   IL_0011
  IL_000a:  pop
  IL_000b:  ldarg.0
  IL_000c:  callvirt   instance !!0 Boo.Lang.Compiler.Ast.Node::GetAncestor<class Boo.Lang.Compiler.Ast.Method>()
  IL_0011:  stloc.0
  IL_0012:  ldloc.0
  IL_0013:  call       bool Boo.Lang.Compiler.Steps.ContextAnnotations::IsAsync(class Boo.Lang.Compiler.Ast.INodeWithBody)
  IL_0018:  stloc.1
  IL_0019:  br.s       IL_001b
  IL_001b:  ldloc.1
  IL_001c:  ret
} // end of method AsyncHelper::InAsyncMethod

The Boo.Lang.Compiler.Ast.Node class is the base class for all AST nodes. BlockExpression and Method are the node classes for lambdas and methods, respectively, both of which implement the INodeWithBody interface. Everything appears correct, both in the C# (which would not build if there were type problems) and in the IL (where it says at 000c that the return type of the GetAncestor<Method> call is !!0, the first type parameter on the method call.)

What is causing PEVerify to think it's dealing with a value of type Node here when it clearly has a value of type Method? And is there any way to fix it?

svick
  • 236,525
  • 50
  • 385
  • 514
Mason Wheeler
  • 82,511
  • 50
  • 270
  • 477
  • 1
    Having debugged a number of these myself over the years: it's almost always involving some version of the `? :` conditional operator. The verifier has changed its strategy for dealing with conditionals over the years, and it can get quite confusing. – Eric Lippert May 21 '17 at 13:36
  • @EricLippert This time it was the `??` operator. (See the accepted answer for details.) Rewriting it as two separate statements fixed it, which is a bit odd as the code ought to be exactly equivalent. I'm just surprised to see `csc` generating unverifiable IL! – Mason Wheeler May 21 '17 at 13:46
  • 1
    The `??` operator is just the `?:` operator in fancy dress. (Similarly `&&` and `||` and `?.`) If `csc` is generating this then either (1) your version of csc was tested against a different version of the verifier; as I said, its algorithm has changed over the years, or (2) csc has yet another freakin' bug in a conditional code path; please report it, glad I don't have to fix it this time, or (3) both. – Eric Lippert May 21 '17 at 13:50
  • @EricLippert This was in VS2013's C# compiler. If I submit a bug report, will that do any good, or will the answer be "update to a newer version of Visual Studio"? :P – Mason Wheeler May 21 '17 at 14:07

2 Answers2

3

What is causing PEVerify to think it's dealing with a value of type Node here when it clearly has a value of type Method?

As pointed out by Stephane Delcroix as well, there are two code paths reaching to IL_0011, as it is the target of a branch. It clearly does not necessarily have a value of type Method.

GetAncestor<TAncestor> returns TAncestor. You've got either GetAncestor<BlockExpression>'s result, or GetAncestor<Method>'s result, so either BlockExpression or Method. Both implement INodeWithBody, so logically, the code is still fine.

Unfortunately, "either BlockExpression or Method" is too much for verification. This gets simplified to Node (the common base), which does not implement INodeWithBody. See ECMA-335 §III.1.8.1.3:

The merged type, U, shall be computed as follows (recall that S := T is the compatibility function defined in §III.1.8.1.2.2):

  1. if S := T then U=S

  2. Otherwise, if T := S then U=T

  3. Otherwise, if S and T are both object types, then let V be the closest common supertype of S and T then U=V.

  4. Otherwise, the merge shall fail.

If you check what the C# compiler does, you'll see that it emits a stloc.0/ldloc.0 combination into a local of type INodeWithBody prior to the dup. This makes everything work, because the common type of INodeWithBody and Method is then INodeWithBody.

Community
  • 1
  • 1
  • Oh wow, I totally missed that conditional jump! Unfortunately, moving the cast to the other side of the ??, or even putting it on both, creates the same unverifiable IL. Any idea how to fix this properly? Or should this be reported as a bug in `csc`? – Mason Wheeler May 21 '17 at 11:18
  • @MasonWheeler Wait -- you're getting this IL from csc? I thought you were getting this from your own compiler, sorry for misunderstanding. csc shouldn't be generating this, and in the last paragraph, I described the behaviour I saw from a C# compilation. You could try to install the Microsoft.Net.Compilers package to force a more up-to-date compiler to be used, and if that doesn't work, indeed, report it as a bug. –  May 21 '17 at 11:23
  • Yeah, this is coming out of the normal C# compiler in Visual Studio 2013. Mono's C# compiler apparently generates the same problematic output. – Mason Wheeler May 21 '17 at 11:37
1

It's unclear if it fails on 0x11 due to the first or the right part of the ?? as 0x08 is branching to there.

I'm leaning toward thinking GetAncestor<> returns a Node and the left part of the ?? is missing an explicit cast.

Stephane Delcroix
  • 16,134
  • 5
  • 57
  • 85
  • 1
    This is a reasonable guess, but it turns out `GetAncestor` returns `TAncestor`, and both `BlockExpression` and `Method` derive from `Node` and implement `INodeWithBody`. –  May 21 '17 at 11:11