
I'm comparing the performance of MethodHandle::invoke with a direct static method invocation. Here is the static method:

public class IntSum {
    public static int sum(int a, int b){
        return a + b;
    }
}

And here is my benchmark:

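import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)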
public class MyBenchmark {

    public int first;
    public int second;
    public final MethodHandle mhh;

    @Benchmark
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    public int directMethodCall() {
        return IntSum.sum(first, second);
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    public int finalMethodHandle() throws Throwable {
        return (int) mhh.invoke(first, second);
    }

    public MyBenchmark() {
        MethodHandle mhhh = null;

        try {
            mhhh = MethodHandles.lookup().findStatic(IntSum.class, "sum", MethodType.methodType(int.class, int.class, int.class));
        } catch (NoSuchMethodException | IllegalAccessException e) {
            e.printStackTrace();
        }

        mhh = mhhh;
    }

    @Setup
    public void setup() throws Exception {
        first = 9857893;
        second = 893274;
    }
}

I got the following result:

Benchmark                      Mode  Cnt  Score   Error  Units
MyBenchmark.directMethodCall   avgt    5  3.069 ± 0.077  ns/op
MyBenchmark.finalMethodHandle  avgt    5  6.234 ± 0.150  ns/op

The MethodHandle call takes roughly twice as long as the direct call.

Running it with -prof perfasm shows this:

....[Hottest Regions]...............................................................................
 31.21%   31.98%         C2, level 4  java.lang.invoke.LambdaForm$DMH::invokeStatic_II_I, version 490 (27 bytes) 
 26.57%   28.02%         C2, level 4  org.sample.generated.MyBenchmark_finalMethodHandle_jmhTest::finalMethodHandle_avgt_jmhStub, version 514 (84 bytes) 
 20.98%   28.15%         C2, level 4  org.openjdk.jmh.infra.Blackhole::consume, version 497 (44 bytes) 

As far as I can tell, the reason for this result is that Hottest Region 2, org.sample.generated.MyBenchmark_finalMethodHandle_jmhTest::finalMethodHandle_avgt_jmhStub, contains all the type checks performed by MethodHandle::invoke inside the JMH loop. Assembly output fragment (some code omitted):

....[Hottest Region 2]..............................................................................
C2, level 4, org.sample.generated.MyBenchmark_finalMethodHandle_jmhTest::finalMethodHandle_avgt_jmhStub, version 519 (84 bytes) 
;...
0x00007fa2112119b0: mov     0x60(%rsp),%r10
;...
0x00007fa2112119d4: mov     0x14(%r12,%r11,8),%r8d  ;*getfield form
0x00007fa2112119d9: mov     0x1c(%r12,%r8,8),%r10d  ;*getfield customized
0x00007fa2112119de: test    %r10d,%r10d
0x00007fa2112119e1: je      0x7fa211211a65    ;*ifnonnull
0x00007fa2112119e7: lea     (%r12,%r11,8),%rsi
0x00007fa2112119eb: callq   0x7fa211046020    ;*invokevirtual invokeBasic
;...
0x00007fa211211a01: movzbl  0x94(%r10),%r10d  ;*getfield isDone
;...
0x00007fa211211a13: test    %r10d,%r10d
;jumping to the beginning of the jmh loop if not done
0x00007fa211211a16: je      0x7fa2112119b0    ;*aload_1 
;...

Before calling invokeBasic, we perform the type checking inside the JMH loop, which affects the measured average time.

QUESTION: Why aren't all the type checks moved out of the loop? I declared public final MethodHandle mhh; inside the benchmark, so I expected the compiler to figure this out and eliminate the repeated type checks. How can I get these repeated type checks eliminated? Is it possible?

St.Antario
  • The method has the signature `MethodHandle.invoke(Object... args)`. Is it possible that the `int` values are also being auto-boxed/unboxed? Looks like there's a lot of black magic in this class. – flakes Mar 15 '18 at 05:56
  • @flakes This is a signature-polymorphic method and gets special treatment from `javac`. You can look at the compiled bytecode. The signature of the compiled method is `MethodHandle.invoke(II)I` – St.Antario Mar 15 '18 at 05:58
  • Ah, that's a new concept for me. Wild! – flakes Mar 15 '18 at 05:59
  • @flakes Btw, the `@PolymorphicSignature` is not public. We cannot create methods like this by ourselves :). – St.Antario Mar 15 '18 at 06:01
  • But why don’t you use `invokeExact`? And which Java version did you use? When using Java 8 and having an interface with a matching signature, you can convert direct method handles to interface implementations via `LambdaMetaFactory`, as shown in [this answer](https://stackoverflow.com/questions/19557829/faster-alternatives-to-javas-reflection/19563000#19563000). – Holger Mar 15 '18 at 11:39
  • @Holger I benchmarked `invokeExact` but the problem was that I did not get any performance improvement. Compiled code was also the same (the same type checks). Anyway, `invoke` works the same as `invokeExact` if the `MethodType` matches, doesn't it? – St.Antario Mar 15 '18 at 11:48
  • It depends on the JRE version; there were implementations where using `invoke` was significantly slower than `invokeExact`, so if you have a choice, prefer `invokeExact`. If it doesn’t help in your Java version, it doesn’t hurt either. By the way, how many warmup iterations did you have? In my experience, method handles need a lot of warmup… – Holger Mar 15 '18 at 11:54
  • @Holger I benchmarked with 5 warmup and 5 iterations. It seemed to be enough for coming to steady state... No? – St.Antario Mar 15 '18 at 12:37
  • @St.Antario `@PolymorphicSignature` - *compiler* overloads... :) of course we are not supposed to get a handle on those. btw `@ForceInline` is private also, but JMH somehow has `@CompilerControl(CompilerControl.Mode.INLINE)` (even if it is stated that this could be ignored) – Eugene Mar 15 '18 at 15:02
  • @Eugene I just thought it might be convenient to have a polymorphic signature, so I could avoid an unnecessary boxing conversion when returning a value... – St.Antario Mar 16 '18 at 15:16
  • I’ve encountered a threshold in the order of twenty with method handles, though, it was with composed handles and in the case of multiple transformations, each step seemed to have its own counter, so when dealing with method handles, I’d always make a test with a really large number of warmup iterations, just to be sure. The other conclusion is to use the `LambdaMetaFactory` for direct handles, whenever possible. – Holger Mar 20 '18 at 08:34

2 Answers


You are using reflective invocation of a MethodHandle. It works roughly like Method.invoke, but with fewer run-time checks and without boxing/unboxing. Since this MethodHandle is not static final, the JVM does not treat it as a constant; that is, the MethodHandle's target is a black box and cannot be inlined.

Even though mhh is final, it contains instance fields such as MethodType type and LambdaForm form that are reloaded on each iteration. These loads are not hoisted out of the loop because of the black-box call inside it (see above). Furthermore, the LambdaForm of a MethodHandle can be changed (customized) at run time between calls, so it needs to be reloaded.

How to make the call faster?

  1. Use a static final MethodHandle. The JIT will know the target of such a MethodHandle and thus may inline it at the call site.

  2. Even if you have a non-static MethodHandle, you may bind it to a static CallSite and invoke it as fast as a direct method call. This is similar to how lambdas are called.

    private static final MutableCallSite callSite = new MutableCallSite(
            MethodType.methodType(int.class, int.class, int.class));
    private static final MethodHandle invoker = callSite.dynamicInvoker();
    
    public MethodHandle mh;
    
    public MyBenchmark() {
        mh = ...;
        callSite.setTarget(mh);
    }
    
    @Benchmark
    public int boundMethodHandle() throws Throwable {
        return (int) invoker.invokeExact(first, second);
    }
    
  3. Use regular invokeinterface instead of MethodHandle.invoke as @Holger suggested. An instance of interface for calling given MethodHandle can be generated with LambdaMetafactory.metafactory().
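
The call-site binding and the LambdaMetafactory route from the list above can be sketched end to end roughly as follows. This is only an illustration under assumptions: the class BoundHandleDemo, its local sum method (a stand-in for IntSum.sum from the question), and the helpers findSum, bind, viaCallSite and asInterface are hypothetical names, and IntBinaryOperator is chosen simply because its applyAsInt(int, int) matches the (int, int) -> int shape.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.function.IntBinaryOperator;

class BoundHandleDemo {
    // Stand-in for IntSum.sum so that the sketch is self-contained.
    static int sum(int a, int b) {
        return a + b;
    }

    // Static call site plus its invoker; both are static final.
    private static final MutableCallSite CALL_SITE = new MutableCallSite(
            MethodType.methodType(int.class, int.class, int.class));
    private static final MethodHandle INVOKER = CALL_SITE.dynamicInvoker();

    // Looks up a handle at run time, as a non-static field initializer would.
    static MethodHandle findSum() {
        try {
            return MethodHandles.lookup().findStatic(BoundHandleDemo.class, "sum",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Binds a handle that is only known at run time to the static call site.
    static void bind(MethodHandle target) {
        CALL_SITE.setTarget(target);
    }

    // Calls through the static invoker; bind(...) must have been called first.
    static int viaCallSite(int a, int b) {
        try {
            return (int) INVOKER.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Wraps a direct method handle in an interface instance via LambdaMetafactory.
    static IntBinaryOperator asInterface(MethodHandle target) {
        try {
            MethodType type = target.type(); // expected to be (int, int) -> int
            CallSite site = LambdaMetafactory.metafactory(
                    MethodHandles.lookup(),
                    "applyAsInt",                                   // interface method name
                    MethodType.methodType(IntBinaryOperator.class), // factory: () -> IntBinaryOperator
                    type,                                           // erased interface method type
                    target,                                         // implementation handle
                    type);                                          // instantiated method type
            return (IntBinaryOperator) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle mh = findSum();
        bind(mh);
        System.out.println(viaCallSite(9857893, 893274));
        System.out.println(asInterface(mh).applyAsInt(9857893, 893274));
    }
}
```

In a benchmark, bind would be called once during setup and the measured method would contain only the call through the invoker or the interface; the try/catch wrappers are there only to keep the helpers free of checked exceptions.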
apangin
  • pardon my ignorance, but if the OP knows that the CallSite will not change, can this code be made to use a `ConstantCallSite` instead? If so, since it is a *constant* `CallSite` would that require for it to be static also? – Eugene Mar 15 '18 at 20:16
  • @Eugene `ConstantCallSite` requires to specify the target method in the constructor. In this sense `ConstantCallSite` is useless - this will be the same as creating static `MethodHandle` directly. `MutableCallSite` on the other hand allows to delay the decision about the target until later in runtime. – apangin Mar 15 '18 at 22:34
  • Ahhh... It means constant folding is applied only to `static final`. I thought for some reason that if we declare an immutable field as `final` the compiler can know that it is immutable and final and hoist some bound checking outside of the loop (in my case). Maybe you know where to find about JIT hoisting/constant folding? I looked at the [opto](http://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/23667c4b2f0e/src/share/vm/opto/) package but it seems blurred across it... – St.Antario Mar 16 '18 at 10:58
  • @St.Antario Final non-static fields are not considered constants by default, unless `-XX:+TrustFinalNonStaticFields` is set. See [`ciField::initialize_from`](http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/de8045923ad2/src/share/vm/ci/ciField.cpp#l201). – apangin Mar 16 '18 at 19:19
  • @Eugene BTW the `SignaturePolymorphic` methods have a strict definition in the JVMS: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.9 So no methods other than the ones in `java.lang.invoke` can be signature polymorphic. – St.Antario Mar 17 '18 at 11:14
  • I don't understand your second example. Can you please replace `mh = ...` with a concrete statement? It would help make the example more understandable. – Gili Dec 28 '18 at 03:30
  • @Gili `mh` points to the target method to be called. There is an example in the original question. – apangin Dec 28 '18 at 08:03
  • @apangin The original question only mentions `mhh` and `mhhh`, both of which are `MethodHandle`s. I hope you understand how this can get confusing. – Gili Dec 28 '18 at 14:40

Make MethodHandle mhh static:

Benchmark            Mode  Samples  Score   Error  Units
directMethodCall     avgt        5  0,942 ± 0,095  ns/op
finalMethodHandle    avgt        5  0,906 ± 0,078  ns/op

Non-static:

Benchmark            Mode  Samples  Score   Error  Units
directMethodCall     avgt        5  0,897 ± 0,059  ns/op
finalMethodHandle    avgt        5  4,041 ± 0,463  ns/op
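
A minimal sketch of the static variant, assuming the handle is held in a static final field. The class StaticHandleDemo, its local sum method (standing in for IntSum.sum from the question) and the helper sumViaHandle are illustrative names and not part of the original benchmark.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class StaticHandleDemo {
    // Stand-in for IntSum.sum so that the sketch is self-contained.
    static int sum(int a, int b) {
        return a + b;
    }

    // static final: the JIT may treat the handle as a constant and inline its target.
    private static final MethodHandle SUM = lookupSum();

    private static MethodHandle lookupSum() {
        try {
            return MethodHandles.lookup().findStatic(StaticHandleDemo.class, "sum",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            // Fail fast instead of leaving the field null.
            throw new ExceptionInInitializerError(e);
        }
    }

    // Calls the target through the constant handle with an exact (int, int) -> int signature.
    static int sumViaHandle(int a, int b) {
        try {
            return (int) SUM.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaHandle(9857893, 893274));
    }
}
```

In the benchmark itself this would amount to changing the mhh field to private static final and initializing it in a static context rather than in the constructor.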
Nikolai
  • Cool, it really works. Now `MethodHandle::invoke` and the actual `IntSum::sum` it invokes are simply inlined into the JMH loop. Why? What happened? Is it possible to do the same in the non-static case? – St.Antario Mar 15 '18 at 05:47
  • @St.Antario I agree, why would adding static work? :| I thought that this was just a problem with the set-up here, so I coded my own version of this (with a setup class), but the results are the same as in your case, twice the difference... – Eugene Mar 15 '18 at 14:59