
I'm trying to detect at runtime whether the source code of a method of a class has changed. Basically, I retrieve the method body (IL), hash it with MD5, and store the hash in the database. The next time I check the method, I can compare the hashes.

public class Changed
{
    public string SomeValue { get; set; }

    public string GetSomeValue()
    {
        return SomeValue + "add something";
    }

    public async Task<string> GetSomeValueAsync()
    {
        return await Task.FromResult(SomeValue + "add something");
    }
}

I'm using Mono.Cecil to retrieve the method bodies:

var module = ModuleDefinition.ReadModule("MethodBodyChangeDetector.exe");

var typeDefinition = module.Types.First(t => t.FullName == typeof(Changed).FullName);

// Retrieve all method bodies (IL instructions as string)
var methodInstructions = typeDefinition.Methods
    .Where(m => m.HasBody)
    .SelectMany(x => x.Body.Instructions)
    .Select(i => i.ToString());

var hash = Md5(string.Join("", methodInstructions));

This works great, except for methods marked as async. Whenever I add some code to the GetSomeValue method, the hash changes. Whenever I add some code to the GetSomeValueAsync method, the hash does not change. Does anyone know how to detect whether the body of an async method has changed?
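
The `Md5` helper called in the snippet above is not shown in the question. A minimal sketch of what it might look like, assuming it hashes the UTF-8 bytes of the string and returns a lowercase hex digest (the class name `HashHelper` is hypothetical):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashHelper
{
    // Hypothetical stand-in for the Md5(...) helper used in the question.
    // Assumption: the input is encoded as UTF-8 before hashing.
    public static string Md5(string input)
    {
        using (var md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            // Convert "D4-1D-8C-..." into a compact lowercase hex string.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Any stable hash would serve the same purpose here, since the value is only compared against a previously stored one.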

Wormbo
Robin van der Knaap
  • I suggest you look at the app with ILSpy, with Display->Options->Decompile async methods disabled. The compiler does quite a lot of magic on `async` methods (as it does on `yield` methods) – xanatos Mar 21 '15 at 07:58
  • Async methods are different: the compiler transforms them into a state machine. See [this](http://tomasp.net/blog/async-compilation-internals.aspx/) for more info. A bit of googling will also help you understand how async methods work internally. – Sriram Sakthivel Mar 21 '15 at 08:01
  • You don't really need Cecil for something so simple... You can use `GetMethodBody().GetILAsByteArray()`/`GetMethodBody().LocalVariables` – xanatos Mar 21 '15 at 08:31
  • @xanatos That will only give the local variables, I want to know if the method body has changed, for example, if lines of code were added or altered. – Robin van der Knaap Mar 21 '15 at 08:35
  • The IL as a byte array only gives the opcodes, which does not include, for example, the strings used in the method – Robin van der Knaap Mar 21 '15 at 08:36
  • @RobinvanderKnaap Yep... You are right, because strings are loaded with `ldstr` from a special table – xanatos Mar 21 '15 at 08:55

3 Answers


I've found a solution, thanks to @xanatos and @Wormbo, who pointed me in the right direction.

In the case of an async method, the C# compiler generates a helper class that contains the method body. These helper classes can be found in the NestedTypes property of the main type. So if we include the method bodies of the nested types, the hash also changes when an async method changes:

var module = ModuleDefinition.ReadModule("MethodBodyChangeDetector.exe");

var typeDefinition = module.Types.First(t => t.FullName == typeof(Changed).FullName);

// Retrieve all method bodies (IL instructions as string)
var methodInstructions = typeDefinition.Methods
    .Where(m => m.HasBody)
    .SelectMany(x => x.Body.Instructions)
    .Select(i => i.ToString());

var nestedMethodInstructions = typeDefinition.NestedTypes
    .SelectMany(x=>x.Methods)
    .Where(m => m.HasBody)
    .SelectMany(x => x.Body.Instructions)
    .Select(i => i.ToString());


var hash = Md5(string.Join("", methodInstructions) + string.Join("", nestedMethodInstructions));

Robin van der Knaap
  • I've just found (inside compiled Razor Views) that some methods can be split into multiple levels, so searching only for first-level NestedTypes doesn't work. This worked for me for searching ALL nested types: var typesAndSubTypes = new List() { typeDefinition }; for(int i = 0; i x.Methods) .Where(m => m.HasBody) .SelectMany(x => x.Body.Instructions).ToList(); – drizin Sep 23 '20 at 17:04
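
The idea in the comment above, walking nested types at every depth rather than only the first level, could be sketched roughly as follows. This is not the commenter's original code. It is a recursive variant that assumes the Mono.Cecil package is referenced, and the names `NestedTypeWalker`, `SelfAndNestedTypes` and `InstructionText` are hypothetical:

```csharp
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;

public static class NestedTypeWalker
{
    // Yields the type itself, followed by its nested types at any depth.
    public static IEnumerable<TypeDefinition> SelfAndNestedTypes(TypeDefinition type)
    {
        yield return type;
        foreach (var nested in type.NestedTypes)
        {
            foreach (var inner in SelfAndNestedTypes(nested))
            {
                yield return inner;
            }
        }
    }

    // Concatenates the textual form of every IL instruction found in the
    // type and in all of its (transitively) nested types.
    public static string InstructionText(TypeDefinition type)
    {
        var instructions = SelfAndNestedTypes(type)
            .SelectMany(t => t.Methods)
            .Where(m => m.HasBody)
            .SelectMany(m => m.Body.Instructions)
            .Select(i => i.ToString());

        return string.Join("", instructions);
    }
}
```

The result of `NestedTypeWalker.InstructionText(typeDefinition)` can then be passed to the same `Md5` call used in the answer.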

Async methods, like iterator methods, are mostly compiled into a nested helper class that represents a state machine. That entire helper class is used only for that one async method (to see the result for your example, open it in ILSpy with the option to decompile async methods disabled). Changes to the method will most likely show up in a generated method of that helper class rather than in the original method.

Wormbo
  • Yes, I've read that as well. Do you know a way to get that generated method during runtime? Or do you suggest I check the source code of ILSpy and see how they do it? – Robin van der Knaap Mar 21 '15 at 08:09
  • At least the Microsoft C# compiler seems to include the method name in the name of the generated class and that generated class implements IAsyncStateMachine. – Wormbo Mar 21 '15 at 08:13
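
For reference, the generated type can also be located at runtime with plain reflection. The sketch below assumes the compiler marks the async method with `AsyncStateMachineAttribute` (.NET 4.5 and later), which points at the generated type. The names `StateMachineLocator` and `Sample` are hypothetical and only serve the illustration:

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class StateMachineLocator
{
    // Returns the compiler-generated state machine type for an async method,
    // or null when the method carries no AsyncStateMachineAttribute.
    public static Type FindStateMachineType(MethodInfo method)
    {
        var attribute = method.GetCustomAttribute<AsyncStateMachineAttribute>();
        return attribute?.StateMachineType;
    }

    // Returns the raw IL of the method that implements
    // IAsyncStateMachine.MoveNext, where the original async body ends up.
    public static byte[] GetMoveNextIL(Type stateMachineType)
    {
        var map = stateMachineType.GetInterfaceMap(typeof(IAsyncStateMachine));
        for (int i = 0; i < map.InterfaceMethods.Length; i++)
        {
            if (map.InterfaceMethods[i].Name == "MoveNext")
            {
                return map.TargetMethods[i].GetMethodBody()?.GetILAsByteArray();
            }
        }
        return null;
    }
}

// Hypothetical sample type with one async and one ordinary method.
public class Sample
{
    public async Task<string> GetAsync()
    {
        return await Task.FromResult("x");
    }

    public string Get()
    {
        return "x";
    }
}
```

As noted in the comments on the question, raw IL bytes contain only tokens for string literals, not their contents, so a hash over these bytes alone can miss a changed string.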

For your second question, without using Cecil (because I don't have it):

using System;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

var method2 = typeof(Program).GetMethod("MyMethodX", BindingFlags.Static | BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
var body = method2.GetMethodBody();

// Types of the local variables that are marked [CompilerGenerated]
Type[] compilerGeneratedVariables = body.LocalVariables
    .Select(x => x.LocalType)
    .Where(x => x.GetCustomAttributes(typeof(CompilerGeneratedAttribute), false).Length != 0)
    .ToArray();
byte[] ilInstructions = body.GetILAsByteArray(); // You can hash these

if (compilerGeneratedVariables.Length != 0)
{
    // There are some local variables of types that are compiler generated
    // This is a good sign that the compiler has rewritten the method
}

If you look at the generated code, you'll see that it clearly needs a local variable of the "hidden" type that has been generated by the compiler. We use this :-) Note that this works for both yield and async methods.

xanatos