
Problem

I'm using Mono.Cecil to IL-weave string property getters that carry my custom [ValidSystemPath] attribute. The purpose of the attribute is to ensure the property only ever returns characters that are valid in file names and paths. The problem is that the woven code does not work, yet no exception is raised during weaving. I'm new to weaving and IL, so I'd benefit from a guiding hand.

Code before weaving (C#)

private string path = "test|.txt";
[ValidSystemPath]
public string Path => path;

Expected code after weaving (C#)

This is effectively my source of inspiration for the code I'm trying to weave in... https://stackoverflow.com/a/23182807/1995360

public string Path {
    get {
        string ReplaceIllegal(string p)
        {
            char[] invalid = Path.GetInvalidFileNameChars().Concat(Path.GetInvalidPathChars()).ToArray();
            return string.Join("_", p.Split(invalid));
        }
        
        return ReplaceIllegal(path);
    }
}

I'd like to use a nested method because a getter that contains conditionals can have multiple return statements, so I need a simple call to the nested method that can be inserted before each of them.
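To make the multiple-return case concrete, here is a minimal stand-alone sketch in plain C# with no weaving involved. PathSanitizer and Get are hypothetical names used only for illustration; the helper body is the same one shown above.

```csharp
using System;
using System.IO;
using System.Linq;

static class PathSanitizer
{
    // Same logic as the nested ReplaceIllegal method in the expected code.
    public static string ReplaceIllegal(string p)
    {
        char[] invalid = Path.GetInvalidFileNameChars()
            .Concat(Path.GetInvalidPathChars())
            .ToArray();
        return string.Join("_", p.Split(invalid));
    }

    // Hypothetical getter body with two exits; each return value is
    // routed through the helper, which is what the weaver has to arrange.
    public static string Get(string path, bool upper)
    {
        if (upper)
        {
            return ReplaceIllegal(path.ToUpperInvariant());
        }
        return ReplaceIllegal(path);
    }
}
```

Note that the set of invalid characters depends on the platform: '|' is reported as invalid on Windows, while on Unix-like systems .NET reports only '\0' and '/', so "test|.txt" would come back unchanged there.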

Weaver code (C#)

private static void ValidSystemPath(ModuleDefinition module, TypeDefinition type, PropertyDefinition property)
{
    // Getter method - site of injection
    MethodDefinition getter = property.GetMethod;
    ILProcessor getterProcessor = getter.Body.GetILProcessor();

    // Import the methods
    MethodReference joinMethod = module.ImportReference(typeof(string).GetMethod("Join", new Type[] { typeof(string), typeof(string[]) }));
    MethodReference splitMethod = module.ImportReference(typeof(string).GetMethod("Split", new Type[] { typeof(char[]) }));
    MethodReference getInvalidPathCharsMethod = module.ImportReference(typeof(Path).GetMethod("GetInvalidPathChars", new Type[] { }));
    MethodReference getInvalidFileNameCharsMethod = module.ImportReference(typeof(Path).GetMethod("GetInvalidFileNameChars", new Type[] { }));
    MethodReference concatMethod = module.ImportReference(typeof(Enumerable).GetMethod("Concat"));
    MethodReference toArrayMethod = module.ImportReference(typeof(Enumerable).GetMethod("ToArray"));
    //MethodReference toArrayMethod = module.ImportReference(typeof(Enumerable).GetMethodExt("ToArray", new Type[] { typeof(IEnumerable<char>) }));

    // Create new nested method in getter
    MethodDefinition nested = new(
        $"<{getter.Name}>g__ReplaceIllegalChars|2_0",
        Mono.Cecil.MethodAttributes.Assembly | Mono.Cecil.MethodAttributes.HideBySig | Mono.Cecil.MethodAttributes.Static,
        module.TypeSystem.String
    );
    type.Methods.Add(nested);

    // Write instructions for method
    ILProcessor nestedProcessor = nested.Body.GetILProcessor();
    nestedProcessor.Emit(OpCodes.Nop);
    nestedProcessor.Emit(OpCodes.Call, getInvalidFileNameCharsMethod);
    nestedProcessor.Emit(OpCodes.Call, getInvalidPathCharsMethod);
    nestedProcessor.Emit(OpCodes.Call, concatMethod);
    nestedProcessor.Emit(OpCodes.Call, toArrayMethod);
    nestedProcessor.Emit(OpCodes.Stloc_0); // Return value is top stack
    nestedProcessor.Emit(OpCodes.Ldstr, "_");
    nestedProcessor.Emit(OpCodes.Ldarg_0);
    nestedProcessor.Emit(OpCodes.Ldloc_0);
    nestedProcessor.Emit(OpCodes.Callvirt, splitMethod); // Non static
    nestedProcessor.Emit(OpCodes.Call, joinMethod);
    nestedProcessor.Emit(OpCodes.Stloc_1);
    nestedProcessor.Emit(OpCodes.Ldloc_1);

    //getterProcessor.Body.SimplifyMacros();
    // Add nested call before each return
    IEnumerable<Instruction> returnInstructions = getterProcessor.Body.Instructions.Where(instruction => instruction.OpCode == OpCodes.Ret);
    returnInstructions.ToList().ForEach(ret => getterProcessor.InsertBefore(ret, Instruction.Create(OpCodes.Call, nested)));
    /*foreach (Instruction ret in returnInstructions)
    {
        // Call nested method and return that value
        getterProcessor.InsertBefore(ret, Instruction.Create(OpCodes.Call, nested));
    }*/
    //getterProcessor.Body.OptimizeMacros();
}

Breakdown of weaver

  1. Find the site of injection and create an ILProcessor.
  2. Import method references for all the method calls we'll be making (string Split/Join, Path GetInvalidPathChars/GetInvalidFileNameChars, and Enumerable Concat/ToArray).
  3. Create the nested method and add it to the type.
  4. Emit the method body.
  5. Add a call to the nested method before each return statement. (A lot of code is commented out here because I was testing different ways of inserting each call. I also tried SimplifyMacros() and OptimizeMacros(), but I was unsure what they did, so I commented them out.)
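Step 5 could be sketched as follows. This is only an illustration, not a definitive implementation: ReturnPatcher and CallBeforeEachReturn are hypothetical names, and the sketch assumes the Mono.Cecil.Rocks namespace is available, since that is where the SimplifyMacros() and OptimizeMacros() extension methods live. SimplifyMacros() expands short-form instructions (for example br.s to br) so that inserted instructions cannot push a short branch out of range, and OptimizeMacros() converts back to short forms where they fit.

```csharp
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;
using Mono.Cecil.Rocks;

static class ReturnPatcher
{
    // Inserts "call helper" immediately before every ret in the method body.
    public static void CallBeforeEachReturn(MethodDefinition method, MethodReference helper)
    {
        var body = method.Body;
        ILProcessor il = body.GetILProcessor();

        // Expand short forms before changing the instruction stream.
        body.SimplifyMacros();

        // Snapshot the ret instructions first, so the collection is not
        // modified while it is being enumerated.
        var returns = body.Instructions.Where(i => i.OpCode == OpCodes.Ret).ToList();
        foreach (Instruction ret in returns)
        {
            il.InsertBefore(ret, il.Create(OpCodes.Call, helper));
        }

        // Shrink back to short forms where possible.
        body.OptimizeMacros();
    }
}
```

One caveat with this approach: a branch that targets a ret directly will jump past the inserted call, so getters with conditionals are worth checking in ILSpy/dnSpy after weaving.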

Expected / Actual runtime output

"test_.txt" / "test|.txt"

Thank you for any help you can provide me in getting this code working.

Dom
  • Weaving will let you generate code that is invalid. Check what your code generates in ILSpy/dnSpy and fix the errors in the IL. A few obvious problems: a missing ret, and missing declarations for the argument and the local variables. – Paweł Łukasik May 06 '23 at 13:13
  • @PawełŁukasik thank you for your recommendations and for pointing out what I have missed. Looking back now, the missing ret seems obvious! But I am still new to IL so forgive me a little haha. I'll try in dnSpy/ILSpy and report back. – Dom May 06 '23 at 13:52

1 Answer


As I mentioned in the comment, the IL does not have to be valid for the file to be saved. Some obfuscators rely on this and only fix the IL just before the method gets executed. If that is not what you want, you need to make sure that what you emit is correct IL that the runtime can execute.

The best approach (IMO) is to write the code you want to generate in C#, compile it, and look at the generated IL in ILSpy/dnSpy.

The problems with your code are:

Missing ret statement.

Just add nestedProcessor.Emit(OpCodes.Ret); at the end of the emitted instructions.

Using arguments

In the line nestedProcessor.Emit(OpCodes.Ldarg_0); you are loading argument 0, but no arguments are defined. Add nested.Parameters.Add(new ParameterDefinition(module.TypeSystem.String)); to indicate that this method takes one argument of type string.

Using locals

In the lines nestedProcessor.Emit(OpCodes.Stloc_0); and nestedProcessor.Emit(OpCodes.Stloc_1); you are using local variables, but those are not defined either.

Add the lines

nested.Body.Variables.Add(new VariableDefinition(module.ImportReference(typeof(char[]))));
nested.Body.Variables.Add(new VariableDefinition(module.TypeSystem.String));

to indicate that this method has two local variables, the first of type char[] and the second of type string.

Generics

In your code, you call generic methods (Concat and ToArray), and those need to be specialized before they can be used in IL. Call MakeGenericMethod with the type argument to specialize them before importing.

var concat = typeof(Enumerable).GetMethod("Concat");
var concat_spec = concat.MakeGenericMethod(typeof(char));
MethodReference concatMethod = module.ImportReference(concat_spec);

var toArray = typeof(Enumerable).GetMethod("ToArray");
var toArray_spec = toArray.MakeGenericMethod(typeof(char));
MethodReference toArrayMethod = module.ImportReference(toArray_spec);
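
Putting the four fixes together, the nested method could be built roughly like this. Treat it as a sketch under assumptions rather than the definitive weaver: NestedMethodBuilder and Build are hypothetical names, the leading nop from the question is dropped because it is not needed, and the method references are imported through reflection exactly as in the question.

```csharp
using System;
using System.IO;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

static class NestedMethodBuilder
{
    public static MethodDefinition Build(ModuleDefinition module, TypeDefinition type, string name)
    {
        MethodReference join = module.ImportReference(typeof(string).GetMethod("Join", new[] { typeof(string), typeof(string[]) }));
        MethodReference split = module.ImportReference(typeof(string).GetMethod("Split", new[] { typeof(char[]) }));
        MethodReference invalidPath = module.ImportReference(typeof(Path).GetMethod("GetInvalidPathChars", Type.EmptyTypes));
        MethodReference invalidFile = module.ImportReference(typeof(Path).GetMethod("GetInvalidFileNameChars", Type.EmptyTypes));
        // Fix 4: specialize the generic methods for char before importing.
        MethodReference concat = module.ImportReference(typeof(Enumerable).GetMethod("Concat").MakeGenericMethod(typeof(char)));
        MethodReference toArray = module.ImportReference(typeof(Enumerable).GetMethod("ToArray").MakeGenericMethod(typeof(char)));

        var nested = new MethodDefinition(
            name,
            MethodAttributes.Assembly | MethodAttributes.HideBySig | MethodAttributes.Static,
            module.TypeSystem.String);
        // Fix 2: declare the single string argument read by ldarg.0.
        nested.Parameters.Add(new ParameterDefinition(module.TypeSystem.String));
        // Fix 3: declare local 0 (char[]) and local 1 (string).
        nested.Body.Variables.Add(new VariableDefinition(module.ImportReference(typeof(char[]))));
        nested.Body.Variables.Add(new VariableDefinition(module.TypeSystem.String));
        type.Methods.Add(nested);

        ILProcessor il = nested.Body.GetILProcessor();
        il.Emit(OpCodes.Call, invalidFile);
        il.Emit(OpCodes.Call, invalidPath);
        il.Emit(OpCodes.Call, concat);
        il.Emit(OpCodes.Call, toArray);
        il.Emit(OpCodes.Stloc_0);          // invalid = ...ToArray()
        il.Emit(OpCodes.Ldstr, "_");
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldloc_0);
        il.Emit(OpCodes.Callvirt, split);  // p.Split(invalid)
        il.Emit(OpCodes.Call, join);       // string.Join("_", ...)
        il.Emit(OpCodes.Stloc_1);
        il.Emit(OpCodes.Ldloc_1);
        il.Emit(OpCodes.Ret);              // Fix 1: the missing ret
        return nested;
    }
}
```

Opening the woven assembly in ILSpy/dnSpy afterwards is still the quickest way to confirm that the result decompiles to the C# you expected.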
Paweł Łukasik
  • Thanks a lot for your help. I was able to get my nested method working thanks to your very helpful feedback. I'm new to Mono.Cecil and IL code, so it was really useful to see what I was missing :) The missing return was an oversight, and I didn't know I had to define locals and arguments, or specialize generics, but it all makes sense now! – Dom May 11 '23 at 07:51