
I know that static methods can be inlined by the JIT in .NET (and Mono).

My question is: can an instance method that accesses its own state be inlined too?

For example:

using System.Runtime.CompilerServices; // needed for [MethodImpl]

public class CaseSensitiveLiteralStringMatcher : IStringMatcher
{
    private readonly LiteralToken _token;

    public CaseSensitiveLiteralStringMatcher(LiteralToken token)
    {
        _token = token;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool IsMatch(char containsChar, int position)
    {
        return containsChar == _token.Value[position];
    }
}

Would the above method call be inlined even though it's not static and accesses a private member?

Darrell

3 Answers


I found a great read here regarding this: http://blogs.microsoft.co.il/sasha/2007/02/27/jit-optimizations-inlining-and-interface-method-dispatching-part-1-of-n/

My conclusion is that instance methods can be inlined, but virtual methods can't, because the actual method called can change at runtime and cannot be established by static analysis of the source code.

For this reason, the method shown in my question could be inlined if it weren't an interface method; being an interface method means it is effectively virtual, in the sense that it has to be dispatched via a vtable lookup at runtime.

That said, there are JIT optimization techniques that can inline virtual method calls for the "common" case, but these come with a fallback for when the inlined method doesn't match the method actually required at runtime, which means certain code paths may benefit more from the inlining than others.
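To make the "common case plus fallback" idea concrete, here is a hand-written approximation of what such a guarded inline amounts to. This is only a sketch: the types `IShape`, `Square` and `Rect` and the class `GuardedCallDemo` are hypothetical names invented for illustration, and the real transformation happens inside the JIT on machine code, not in C# source.

```csharp
using System;

public interface IShape
{
    int Area();
}

public sealed class Square : IShape
{
    private readonly int _side;
    public Square(int side) { _side = side; }
    public int Side => _side;          // exposed only so the sketch below can mimic the inlined body
    public int Area() => _side * _side;
}

public sealed class Rect : IShape
{
    private readonly int _w, _h;
    public Rect(int w, int h) { _w = w; _h = h; }
    public int Area() => _w * _h;
}

public static class GuardedCallDemo
{
    // What the source says: an ordinary interface call.
    public static int AreaViaInterface(IShape shape) => shape.Area();

    // Roughly what a guarded inline looks like: test for the type that is
    // assumed to be common, run its body directly, otherwise fall back.
    public static int AreaGuarded(IShape shape)
    {
        if (shape.GetType() == typeof(Square))
        {
            var sq = (Square)shape;
            return sq.Side * sq.Side;  // "inlined" body of Square.Area
        }
        return shape.Area();           // fallback: normal interface dispatch
    }

    public static void Main()
    {
        IShape[] shapes = { new Square(3), new Rect(2, 5) };
        foreach (IShape s in shapes)
        {
            Console.WriteLine($"{s.GetType().Name}: {AreaViaInterface(s)} / {AreaGuarded(s)}");
        }
    }
}
```

Only calls that hit the guessed type take the fast path; every other implementation still pays for the type check and then the regular dispatch.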

Darrell
  • The answer here seems to contradict you, in that it suggests method calls via an interface are dispatched via a vtable lookup - https://stackoverflow.com/questions/22005700/will-c-sharp-inline-methods-that-are-declared-in-interfaces which makes sense imho :-) – Darrell Mar 02 '18 at 01:16

Okay. I have results. The answer appears to be that it is possible for the JIT to inline a method that implements an interface and that accesses or modifies a class member.

My results are:

  • 10^7 runs of process1: 84 ms
  • 10^7 runs of process2 (via interface): 83 ms
  • 10^7 runs of inline loop without class or method call: 83 ms

i.e. identical performance with and without the interface. Also, performance remains the same without the AggressiveInlining attribute.

Test code:

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

class Program
{
    internal interface IFastProcessor
    {
        void Process(int i);
    }

    internal sealed class FastProcessorImpl : IFastProcessor
    {
        private int number;

        public FastProcessorImpl(int number)
        {
            this.number = number;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Process(int i)
        {
            number = ((number + i) / (number + i)) * number;
        }
    }

    internal sealed class FastProcessor
    {
        private int number;

        public FastProcessor(int number)
        {
            this.number = number;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Process(int i)
        {
            number = ((number + i) / (number + i)) * number;
        }
    }

    static void Main(string[] args)
    {
        var sw1 = new Stopwatch();
        var processor1 = new FastProcessor(10);
        sw1.Start();
        for (int i = 1; i < 10000000; i++)
        {
            processor1.Process(i);
        }
        sw1.Stop();

        var sw2 = new Stopwatch();
        var processor2 = (IFastProcessor)new FastProcessorImpl(10);
        sw2.Start();
        for (int i = 1; i < 10000000; i++)
        {
            processor2.Process(i);
        }
        sw2.Stop();

        var number = 10;
        var sw3 = new Stopwatch();
        sw3.Start();
        for (int i = 1; i < 10000000; i++)
        {
            number = ((number + i) / (number + i)) * number;
        }
        sw3.Stop();

        Console.WriteLine($"Class: {sw1.ElapsedMilliseconds}ms, Interface: {sw2.ElapsedMilliseconds}ms, Inline: {sw3.ElapsedMilliseconds}ms");
    }
}

UPDATE: I also tried a base class with a virtual method. To my extreme surprise, this also performed identically to the inline version, meaning that perhaps the compiler was optimising away the virtual call, allowing the JIT to inline anyway. So I can't be certain about the interfaces vs. virtual methods question. On the other hand, for the OP's question, I don't see a reason why the method wouldn't be inlined.
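One way to probe this further is to hide the concrete type from the call site, so the JIT cannot prove which implementation it is looking at. The sketch below is an assumption-laden variation on the test above, not a verified benchmark: `IProcessor`, `AddProcessor`, `XorProcessor` and `OpaqueCallSiteDemo` are hypothetical names, and `NoInlining` on the factory is used only to keep the constructor call out of the caller's view. Note that a runtime with PGO enabled may still guess the type.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public interface IProcessor
{
    void Process(int i);
    int Value { get; }
}

public sealed class AddProcessor : IProcessor
{
    private int _value;
    public int Value => _value;
    public void Process(int i) { _value += i; }   // wraps on overflow (unchecked by default)
}

public sealed class XorProcessor : IProcessor
{
    private int _value;
    public int Value => _value;
    public void Process(int i) { _value ^= i; }
}

public static class OpaqueCallSiteDemo
{
    // NoInlining keeps the "new" out of the caller, so the caller
    // only ever sees an IProcessor of unknown concrete type.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static IProcessor Create(bool useAdd)
    {
        if (useAdd) return new AddProcessor();
        return new XorProcessor();
    }

    public static int Run(IProcessor processor, int iterations)
    {
        for (int i = 1; i <= iterations; i++)
        {
            processor.Process(i);
        }
        return processor.Value;
    }

    public static void Main()
    {
        const int N = 10_000_000;

        var sw1 = Stopwatch.StartNew();
        int a = Run(Create(true), N);
        sw1.Stop();

        var sw2 = Stopwatch.StartNew();
        int b = Run(Create(false), N);
        sw2.Stop();

        // Print the results so the loops cannot be treated as dead code.
        Console.WriteLine($"Add: {sw1.ElapsedMilliseconds}ms ({a}), Xor: {sw2.ElapsedMilliseconds}ms ({b})");
    }
}
```

Because the same `Run` loop is fed two different implementations, its interface call site is no longer trivially monomorphic, which is closer to real-world use of an interface.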

Adam Brown
  • Adam, that is interesting, thank you. I notice that in your test code, it's possible to tell from static analysis that `processor1` will always be a `FastProcessor` and never any other concrete type. In that case I assume inlining can occur. The same goes for `processor2`. I wonder: if you called processor1 (as a FastProcessor) for 50% of the calls, then set processor1 to a FastProcessorImpl and called it again for the remainder of the calls, would you notice a difference then? – Darrell Dec 22 '17 at 02:06

Interfaces allow us to design better code, but they complicate matters when the code needs to be optimized. The jitter (and sometimes the compiler) has an arsenal of techniques that it uses at runtime to try to see through our code and make it perform better. As of .NET 5, these optimizations are done while the application is running (and they can be reapplied if the jitter detects poor performance). For a glimpse of what it can do, have a look at the RyuJIT Tutorial.

Conceptually, calls to interface methods are dispatched through a v-table. At a lower level, however, the call can be made directly, or even inlined, when the jitter is able to infer that the call site satisfies certain constraints. This technique is called devirtualization.

Generally if the jit can determine the type of the this object at an interface call, it can devirtualize and then potentially inline. There are two main mechanisms for determining types:

  • deduce the type from flow analysis within a method
  • enable PGO, have that observe the possible types for this, and then test for the most likely type when rejitting or in a future run of the process.

Last I looked flow analysis can enable devirtualization and inlining in a relatively small fraction of interface call cases (say no more than 10%). Success here requires that there be some "organic" evidence of type identity (constructor call or type test) upstream from the interface site. Inlining can help bring together the needed information but currently the inlining heuristics do not include increased devirtualization potential as part of their evaluation. That may change soon (see e.g. #53670).
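As an illustration of what "organic" evidence of type identity might look like in source code, here is a small sketch. `ICounter`, `StepCounter` and `FlowEvidenceDemo` are hypothetical names, and whether a given runtime version actually devirtualizes each call is an assumption that would have to be confirmed by inspecting the generated code.

```csharp
using System;

public interface ICounter
{
    int Next();
}

public sealed class StepCounter : ICounter
{
    private int _current;
    public int Next() => ++_current;   // instance method touching its own state
}

public static class FlowEvidenceDemo
{
    // Evidence from a constructor call: the exact type is visible
    // in the same method, just before the interface call.
    public static int KnownFromConstructor()
    {
        ICounter counter = new StepCounter();
        return counter.Next();
    }

    // Evidence from a type test: inside the branch the type is known,
    // and StepCounter is sealed, so no subclass can override Next.
    public static int KnownFromTypeTest(ICounter counter)
    {
        if (counter is StepCounter step)
        {
            return step.Next();
        }
        return counter.Next();
    }

    // No evidence: the concrete type depends entirely on the caller.
    public static int Unknown(ICounter counter) => counter.Next();

    public static void Main()
    {
        var shared = new StepCounter();
        Console.WriteLine(KnownFromConstructor());
        Console.WriteLine(KnownFromTypeTest(shared));
        Console.WriteLine(Unknown(shared));
    }
}
```

The third method is the typical shape of an interface call site, which is consistent with the observation that flow analysis alone covers only a small share of them.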

PGO is quite effective at devirtualizing interface calls; most studies I have done show upwards of 80% of interface call sites have one dominant implementing class for this.

Inlining

The inlining heuristics are complicated and difficult to summarize concisely. Roughly speaking a method will be inlined if:

  • there is a direct call to the method, OR the jit can devirtualize an interface or virtual call, AND
      • the method being invoked is small (16 bytes of IL or fewer), OR
      • the method being invoked is marked with AggressiveInlining, OR
      • the method is medium sized (17 to ~100 bytes of IL) and the inline heuristics determine the inline is worthwhile
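Since these thresholds are expressed in bytes of IL, it can help to check how large a method's IL actually is. The sketch below uses reflection for that; `Point` and `IlSizeDemo` are hypothetical names, and the reported sizes depend on the compiler and on Debug versus Release settings, so no particular numbers are claimed here.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public sealed class Point
{
    private readonly int _x, _y;
    public Point(int x, int y) { _x = x; _y = y; }

    // Tiny instance method reading its own state.
    public int Sum() => _x + _y;

    // Explicitly asks the JIT to inline, regardless of IL size.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int WeightedSum(int wx, int wy) => _x * wx + _y * wy;

    // Explicitly forbids inlining.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public int Difference() => _x - _y;
}

public static class IlSizeDemo
{
    // Returns the size, in bytes, of the IL body of a public method.
    public static int IlSize(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName)
            ?? throw new ArgumentException($"No public method named {methodName}.");
        MethodBody body = method.GetMethodBody()
            ?? throw new InvalidOperationException("Method has no IL body.");
        byte[] il = body.GetILAsByteArray() ?? Array.Empty<byte>();
        return il.Length;
    }

    // Reports whether the method carries a given implementation flag.
    public static bool HasImplFlag(Type type, string methodName, MethodImplAttributes flag)
    {
        MethodInfo method = type.GetMethod(methodName)
            ?? throw new ArgumentException($"No public method named {methodName}.");
        return (method.MethodImplementationFlags & flag) == flag;
    }

    public static void Main()
    {
        foreach (string name in new[] { "Sum", "WeightedSum", "Difference" })
        {
            bool aggressive = HasImplFlag(typeof(Point), name, MethodImplAttributes.AggressiveInlining);
            Console.WriteLine($"{name}: {IlSize(typeof(Point), name)} bytes of IL, AggressiveInlining={aggressive}");
        }
    }
}
```

This only reports the inputs to the heuristics; it does not tell you whether the JIT actually inlined a given call.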

The above description comes from Andy Ayers, in a long-standing issue aimed at improving performance in cases like this (#7291).

As the runtime and the jitter improve over time, code that previously wasn't optimized can now benefit from certain optimizations. Indeed, this happened a month ago, with additional improvements in the upcoming framework.

Side note

Microbenchmarking requires certain technical and statistical skills, as many things can go wrong (e.g. noise, dynamic processor frequency scaling, optimizations, code warmup...). There are frameworks that allow you to perform such measurements in a more statistically sound and repeatable environment. The .NET project itself uses BenchmarkDotNet; it may help you get a better understanding of your code.
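To show the kind of care involved, here is a minimal standard-library-only sketch that adds a warmup phase and reports the median of several samples instead of a single run. `TinyBench` and its parameters are hypothetical names for illustration; this is a rough stand-in for what a proper benchmarking framework does far more rigorously, not a replacement for one.

```csharp
using System;
using System.Diagnostics;

public static class TinyBench
{
    // Runs the action a few times untimed (so it gets jitted and warmed up),
    // then times it repeatedly and returns the median in milliseconds.
    public static double MedianMilliseconds(Action action, int warmupRuns, int samples)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (samples <= 0) throw new ArgumentOutOfRangeException(nameof(samples));

        for (int i = 0; i < warmupRuns; i++)
        {
            action();
        }

        var times = new double[samples];
        for (int s = 0; s < samples; s++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            times[s] = sw.Elapsed.TotalMilliseconds;
        }

        Array.Sort(times);
        int mid = samples / 2;
        return samples % 2 == 1
            ? times[mid]
            : (times[mid - 1] + times[mid]) / 2.0;
    }

    public static void Main()
    {
        long sink = 0; // consumed below so the work cannot be discarded
        double ms = MedianMilliseconds(() =>
        {
            for (int i = 1; i < 10_000_000; i++)
            {
                sink += i % 7;
            }
        }, warmupRuns: 3, samples: 9);

        Console.WriteLine($"median: {ms:F2} ms (sink={sink})");
    }
}
```

The median is less sensitive than the mean to an occasional slow sample, for example one caused by a garbage collection or a context switch.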

Yennefer