.NET apps are distributed in files called assemblies, which contain metadata and common intermediate language (CIL) code. The standard that .NET conforms to, ECMA-335, II.3, notes a distinction between two similar-sounding terms:

  • An assembly is valid if it conforms to the standard.

    Validation refers to the application of a set of tests on any file to check that the file’s format, metadata, and CIL are self-consistent. These tests are intended to ensure that the file conforms to the normative requirements of this specification.

  • An assembly is verifiable if the assembly is valid and it can be proven, via a static analysis algorithm described by the standard, that the assembly is type-safe.

    Verification refers to the checking of both CIL and its related metadata to ensure that the CIL code sequences do not permit any access to memory outside the program’s logical address space. In conjunction with the validation tests, verification ensures that the program cannot access memory or other resources to which it is not granted access.

All verifiable assemblies are valid, but not all valid assemblies are verifiable. Additionally, some valid assemblies may in practice be type-safe, but the verification algorithm cannot prove them as such, so they are not verifiable. To use a diagram from the standard:

ECMA-335, II.3, Figure 1: Relationship between correct and verifiable CIL

The .NET SDK provides a tool to statically determine if an assembly is verifiable: PEVerify. Because verifiable assemblies must also be valid, this tool also reports errors if the assembly is not valid.

However, there doesn't seem to be an equivalent tool or procedure for determining if an assembly is just valid. For instance, if I already know the assembly is unverifiable, and I'm OK with that, how do I still ensure that the runtime won't error due to an invalid program?

My test case:

.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 4:0:0:0
}

.assembly MyAsm { }
.module MyAsm.exe
.corflags 0x00020003    //  ILONLY 32BITPREFERRED

.class public Program
{ 
  .method public static int32 EntryPoint(string[] args) cil managed
  {
    .maxstack 2
    .entrypoint

    call string [MyAsm]Program::normal()
    call void [mscorlib]System.Console::WriteLine(string)

    call string [MyAsm]Program::unverifiable_init()
    call void [mscorlib]System.Console::WriteLine(string)

    call string [MyAsm]Program::unverifiable_jmp()
    call void [mscorlib]System.Console::WriteLine(string)

    call string [MyAsm]Program::invalid()
    call void [mscorlib]System.Console::WriteLine(string)

    ldc.i4.0
    ret
  }

  .method public static string normal() cil managed
  {
    .maxstack 2
    .locals init ([0] int32 initialized)

    ldstr  "From normal: "
    ldloca initialized
    call instance string [mscorlib]System.Int32::ToString()
    call string [mscorlib]System.String::Concat(string, string)

    ret
  }

  .method public static string unverifiable_jmp() cil managed
  {
    .maxstack 1

    ldstr "Printing from unverifiable_jmp!"
    call void [mscorlib]System.Console::WriteLine(string)

    jmp string [MyAsm]Program::normal() // jmp is always unverifiable
  }

  .method public static string unverifiable_init() cil managed
  {
    .maxstack 2
    .locals ([0] int32 hasGarbage) // uninitialized locals are unverifiable

    ldstr  "From unverifiable_init: "
    ldloca hasGarbage
    call instance string [mscorlib]System.Int32::ToString()
    call string [mscorlib]System.String::Concat(string, string)

    ret
  }

  .method public static string invalid() cil managed
  {
    .maxstack 1

    ldstr "Printing from invalid!"
    call void [mscorlib]System.Console::WriteLine(string)

    ldstr "From invalid"
    // method fall-through (no ret) is invalid
  }
}

I assemble this using ilasm, producing MyAsm.exe.

I can run the assembly, but the .NET runtime only raises an error when the invalid() method is actually called, not when the assembly is loaded. If I remove the call, the program runs to completion with no errors, so just loading and running an assembly doesn't guarantee that it is fully valid.

Running PEVerify on the assembly produces three errors. In this case it is reasonably easy for a human to see that the first two are verification errors and the last one is a validation error, but there doesn't seem to be an easy way to automate that distinction (e.g., checking each line for the substring "verifi" seems too broad).

Microsoft (R) .NET Framework PE Verifier.  Version  4.0.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

[IL]: Error: [C:\...\MyAsm.exe : Program::unverifiable_jmp][offset 0x0000000A] Instruction cannot be verified.
[IL]: Error: [C:\...\MyAsm.exe : Program::unverifiable_init][offset 0x00000005] initlocals must be set for verifiable methods with one or more local variables.
[IL]: Error: [C:\...\MyAsm.exe : Program::invalid][offset 0x0000000A] fall through end of the method without returning
3 Error(s) Verifying MyAsm.exe
Joe Sewell
  • Bad thought number 1 - you *could* abuse `ngen` here, I think. Since it has to compile all methods, it should generate some kind of error for the invalid method. I can't think of anything else that's built in that could e.g. force all methods of an assembly to be JITted. – Damien_The_Unbeliever Dec 29 '17 at 07:02
  • @Damien_The_Unbeliever I tried this and it seems to work. `ngen install MyAsm.exe` does report an error just for validity, not verifiability (`Common Language Runtime detected an invalid program. while compiling method Program.invalid`). Strangely, this still exits with code 0, and I can later use `ngen display MyAsm.exe` (though this call then exits with -1) and I have to use `ngen uninstall MyAsm.exe` before calling `install` again, so I guess it's installed despite AOT compilation errors? Anyway, I think you can submit this as an answer. – Joe Sewell Dec 29 '17 at 17:28
  • I didn't submit it as an answer for a reason - it's seriously abusing the infrastructure to get a result. thehennyy seems to have provided some kind of answer that looks like it would be less "polluting" to use. In general, it seems you will need to provoke compilation somewhere to get validity errors and there isn't the tool you seek (that only validates, doesn't verify) – Damien_The_Unbeliever Dec 29 '17 at 17:34
  • Possible duplicate of [Can PEVerify tell me the severity of each error?](https://stackoverflow.com/questions/31908847/can-peverify-tell-me-the-severity-of-each-error) – Joe Sewell Dec 29 '17 at 18:38

1 Answer


Based on @Damien_The_Unbeliever's comment, I wrote this small snippet that uses the RuntimeHelpers.PrepareMethod method to compile each method. It will not handle all cases (nested types, generics, reference resolution, ...), but it works as a starting point:

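using System;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

// Load the assembly under test from its raw bytes.
var b = File.ReadAllBytes("MyAsm.exe");
var asm = Assembly.Load(b);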

foreach(var m in asm.GetModules())
{
    foreach(var t in m.GetTypes())
    {
        foreach(var mb in t.GetMethods((BindingFlags)62).Cast<MethodBase>().Union(t.GetConstructors((BindingFlags)62)))
        {
            try
            {
                RuntimeHelpers.PrepareMethod(mb.MethodHandle);
            }
            catch (InvalidProgramException ex)
            {
                Console.WriteLine($"{mb.DeclaringType}::{mb.Name} - {ex.Message}");
            }

        }
    }
}

For the test case in the question, it outputs:

Program::invalid - Common Language Runtime detected an invalid program.

thehennyy
  • Thanks. Two notes if anyone tries to reproduce this: (1) `(BindingFlags) 62` is the combination of the `DeclaredOnly`, `Instance`, `Static`, `Public`, and `NonPublic` flags. (2) The `PrepareMethod` call requires the assembly to be in the same directory as this program's assembly. – Joe Sewell Dec 29 '17 at 18:32
  • I wonder, how hard would it be to make it handle generic methods too? – IS4 Jan 03 '18 at 01:48
  • @IllidanS4 Two options come to mind: use an ILReader to get and later process all occurring instantiations, or just use some "random" types that at least fulfil the constraints, if there are any. Since the validity of a generic method should not depend on the generic parameter, the latter approach could be sufficient. – thehennyy Jan 03 '18 at 07:18
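
For anyone reproducing the answer, here is a minimal sketch that folds the comment notes into the snippet: it spells out `(BindingFlags)62` with named flags and returns a failure count so a build script can act on it. The names `ValidityCheck`, `AllDeclared`, and `Check` are hypothetical, not part of any library. Skipping abstract members and anything with unbound generic parameters is an assumption made to keep the sketch simple; it does not implement either of the generic-handling options discussed above.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class ValidityCheck
{
    // Named equivalent of (BindingFlags)62 from the answer.
    public const BindingFlags AllDeclared =
        BindingFlags.DeclaredOnly | BindingFlags.Instance | BindingFlags.Static |
        BindingFlags.Public | BindingFlags.NonPublic;

    // Asks the runtime to prepare (compile) every method and constructor
    // declared in the assembly, printing each one rejected as an invalid
    // program. Returns the number of rejected members.
    public static int Check(Assembly asm)
    {
        int failures = 0;
        foreach (var t in asm.GetModules().SelectMany(m => m.GetTypes()))
        {
            var members = t.GetMethods(AllDeclared).Cast<MethodBase>()
                           .Concat(t.GetConstructors(AllDeclared));
            foreach (var mb in members)
            {
                // Assumption: abstract members have no body to compile, and
                // members with unbound generic parameters would need a
                // concrete instantiation, so both are skipped here.
                if (mb.IsAbstract || mb.ContainsGenericParameters)
                    continue;

                try
                {
                    RuntimeHelpers.PrepareMethod(mb.MethodHandle);
                }
                catch (InvalidProgramException ex)
                {
                    failures++;
                    Console.WriteLine($"{mb.DeclaringType}::{mb.Name} - {ex.Message}");
                }
            }
        }
        return failures;
    }
}
```

It could be called as `ValidityCheck.Check(Assembly.Load(File.ReadAllBytes("MyAsm.exe")))`, with a nonzero result treated as "not valid". Members that were skipped are not checked at all, so a zero result is only as strong as the coverage.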