45

I'm writing some code that calls Field.set and Field.get many many thousands of times. Obviously this is very slow because of the reflection.

I want to see if I can improve performance using MethodHandle in Java 7. So far here's what I have:

Instead of field.set(pojo, value), I'm doing:

private static final Map<Field, MethodHandle> setHandles = new HashMap<>();

MethodHandle mh = setHandles.get(field);
if (mh == null) {
    mh = lookup.unreflectSetter(field);
    setHandles.put(field, mh);
}
mh.invoke(pojo, value);

However, this doesn't seem to perform better than the Field.set call using reflection. Am I doing something wrong here?

I read that using invokeExact could be faster but when I tried using that I got a java.lang.invoke.WrongMethodTypeException.

Has anyone successfully been able to optimize repeated calls to Field.set or Field.get?

Basil Bourque
aloo
  • 1
    It may be the case that in Java 7 `MethodHandle`s are just slow. One time we also tried to replace reflective calls with them and they turned out to be actually worse. Hopefully things get better in Java 8, where `MethodHandle`s are used to create lambda classes. Try testing it on JDK8. – ghik Mar 10 '14 at 20:43
  • 3
    Could you explain why you are using Field.get/set and reflection? What is it you are trying to accomplish (higher problem level). Remember the admonition from Java "Given an instance of a class, it is possible to use reflection to set the values of fields in that class. This is typically done only in special circumstances when setting the values in the usual way is not possible. Because such access usually violates the design intentions of the class, it should be used with the utmost discretion" – ErstwhileIII Mar 11 '14 at 02:07
  • 1
    I agree with @ErstwhileIII. If you're doing this so many thousands of times that performance is a bottleneck, you're already doing something wrong. Consider defining and implementing an interface. – user207421 Mar 11 '14 at 02:44
  • 2
    I'm using Objectify (an ORM for App Engine), a library that converts between registered POJOs and App Engine datastore entities. This is especially nice because you don't have to write converters between the two structures; the conversion is done automatically using reflection. I'd like to modify this library so that it's more performant. After profiling it, it's clear that field.set/get is responsible for most of the perf bottleneck – aloo Mar 11 '14 at 03:19

4 Answers

74

2015-06-01: Updated to reflect @JoeC's comment about another case when handles are static. Also updated to latest JMH and re-ran on modern hardware. The conclusion stays almost the same.

Please do proper benchmarking; it is arguably not that hard with JMH. Once you do that, the answer becomes obvious. The benchmark below also showcases the proper use of invokeExact (requires target/source 1.7 to compile and run):

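import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)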
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class MHOpto {

    private int value = 42;

    private static final Field static_reflective;
    private static final MethodHandle static_unreflect;
    private static final MethodHandle static_mh;

    private static Field reflective;
    private static MethodHandle unreflect;
    private static MethodHandle mh;

    // We would normally use @Setup, but we need to initialize "static final" fields here...
    static {
        try {
            reflective = MHOpto.class.getDeclaredField("value");
            unreflect = MethodHandles.lookup().unreflectGetter(reflective);
            mh = MethodHandles.lookup().findGetter(MHOpto.class, "value", int.class);
            static_reflective = reflective;
            static_unreflect = unreflect;
            static_mh = mh;
        } catch (IllegalAccessException | NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    @Benchmark
    public int plain() {
        return value;
    }

    @Benchmark
    public int dynamic_reflect() throws InvocationTargetException, IllegalAccessException {
        return (int) reflective.get(this);
    }

    @Benchmark
    public int dynamic_unreflect_invoke() throws Throwable {
        return (int) unreflect.invoke(this);
    }

    @Benchmark
    public int dynamic_unreflect_invokeExact() throws Throwable {
        return (int) unreflect.invokeExact(this);
    }

    @Benchmark
    public int dynamic_mh_invoke() throws Throwable {
        return (int) mh.invoke(this);
    }

    @Benchmark
    public int dynamic_mh_invokeExact() throws Throwable {
        return (int) mh.invokeExact(this);
    }

    @Benchmark
    public int static_reflect() throws InvocationTargetException, IllegalAccessException {
        return (int) static_reflective.get(this);
    }

    @Benchmark
    public int static_unreflect_invoke() throws Throwable {
        return (int) static_unreflect.invoke(this);
    }

    @Benchmark
    public int static_unreflect_invokeExact() throws Throwable {
        return (int) static_unreflect.invokeExact(this);
    }

    @Benchmark
    public int static_mh_invoke() throws Throwable {
        return (int) static_mh.invoke(this);
    }

    @Benchmark
    public int static_mh_invokeExact() throws Throwable {
        return (int) static_mh.invokeExact(this);
    }

}

On 1x4x2 i7-4790K, JDK 8u40, Linux x86_64 it yields:

Benchmark                             Mode  Cnt  Score   Error  Units
MHOpto.dynamic_mh_invoke              avgt   25  4.393 ± 0.003  ns/op
MHOpto.dynamic_mh_invokeExact         avgt   25  4.394 ± 0.007  ns/op
MHOpto.dynamic_reflect                avgt   25  5.230 ± 0.020  ns/op
MHOpto.dynamic_unreflect_invoke       avgt   25  4.404 ± 0.023  ns/op
MHOpto.dynamic_unreflect_invokeExact  avgt   25  4.397 ± 0.014  ns/op
MHOpto.plain                          avgt   25  1.858 ± 0.002  ns/op
MHOpto.static_mh_invoke               avgt   25  1.862 ± 0.015  ns/op
MHOpto.static_mh_invokeExact          avgt   25  1.859 ± 0.002  ns/op
MHOpto.static_reflect                 avgt   25  4.274 ± 0.011  ns/op
MHOpto.static_unreflect_invoke        avgt   25  1.859 ± 0.002  ns/op
MHOpto.static_unreflect_invokeExact   avgt   25  1.858 ± 0.002  ns/op

...which suggests MethodHandles really are much faster than reflection in this particular case (because the access checks against the private field are done at lookup time, not at invocation time). The dynamic_* cases simulate the situation where the MethodHandles and/or Fields are not statically known, e.g. pulled from a Map<String, MethodHandle> or something like it. Conversely, the static_* cases are those where the invokers are statically known.
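To make the lookup-time versus invocation-time distinction concrete, here is a minimal sketch. The classes Holder and LookupTimeCheckDemo and the field secret are hypothetical names for illustration, not part of the benchmark above. The handle is created inside the class that owns the private field, so the access check passes once, at lookup. Other code can then read the field through the handle, whereas a plain Field.get from another top-level class (without setAccessible(true)) is checked on each call and rejected.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

// Hypothetical class owning a private field.
class Holder {
    private int secret = 42;

    // The lookup is created inside Holder, so it may access the private field.
    // The access check happens once, here, when the handle is looked up.
    static final MethodHandle SECRET_GETTER;
    static {
        try {
            SECRET_GETTER = MethodHandles.lookup().findGetter(Holder.class, "secret", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }
}

// Hypothetical caller living in a different top-level class.
class LookupTimeCheckDemo {

    // Invoking the handle performs no further access check.
    static int readViaHandle(Holder h) {
        try {
            // Call-site signature (Holder)int matches the handle type exactly.
            return (int) Holder.SECRET_GETTER.invokeExact(h);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    // Reflection checks access on the call itself; without setAccessible(true)
    // the read of the private field is refused.
    static boolean reflectionIsRejected(Holder h) {
        try {
            Field f = Holder.class.getDeclaredField("secret");
            f.get(h);
            return false;
        } catch (IllegalAccessException e) {
            return true;
        } catch (NoSuchFieldException e) {
            throw new AssertionError(e);
        }
    }
}
```

Handing out a handle like this is a deliberate capability grant by the owning class, which is one reason the per-call check can be skipped.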

Notice that reflective performance is on par with MethodHandles in the dynamic_* cases. This is because reflection was heavily optimized further in JDK 8 (because really, you don't need the access check to read your own fields), so the answer may be "just" switching to JDK 8 ;)

The static_* cases are even faster, because the MethodHandle.invoke calls are aggressively inlined. This eliminates part of the type checking in the MH cases. In the reflection cases, however, quick checks are still present, and therefore reflection lags behind.

Aleksey Shipilev
  • 4
    Unbelievable, you get tons of upvotes for simply saying the same as I did, while my answer is questioned even though your highly rated answer proves the results. And you didn’t even answer how to *solve* aloo’s actual problem of not being able to use `invokeExact`… – Holger Mar 12 '14 at 09:22
  • 7
    I submit that the answer is obvious when you do proper benchmarking. – Aleksey Shipilev Mar 12 '14 at 09:34
  • 2
    OP's original question was about optimizing Field.get() performance, possibly with MethodHandles, and my answer delivers on that. But your comment about `invokeExact` is somewhat fair, I was implicitly thinking showing up the working code which uses `invokeExact` answers the question how to use it. – Aleksey Shipilev Mar 12 '14 at 09:44
  • 8
    @Holger This answer is voted up because it uses accepted benchmarking techniques. The JVM is too smart for its own good, and can sometimes even outsmart JMH benchmarks. Your benchmarking technique shows disregard of JVM complexity, and a misunderstanding of the problem when nanotime was used to measure nanosecond-scale calls. – Xorlev Mar 12 '14 at 17:20
  • @AlekseyShipilev the only thing I don't think this answers (because I wasn't explicit in my question) is that on JDK7, you would rarely use Field.get() without setting setAccessible(true) first. When doing this, I can't seem to get the MH approach to be faster than straight reflection. – aloo Mar 17 '14 at 15:59
  • 1
    @aloo: True. Although it highlights the difference between invocation-time access checks in Reflection and lookup-time in MH. I tend to think `setAccessible(true)` is suspicious in security-audited library code ;) – Aleksey Shipilev Mar 17 '14 at 16:50
  • 2
    Although this is a great benchmark, you're not letting the JIT do what it does best: inline the hell out of everything. When you mark the Handles/Methods as private static final, you get a very different picture on performance. invokeExact ends up coming out on top, 2.6x faster than reflection. I also added invokeSpecial for comparison, link to full results https://gist.github.com/mooman219/f85c6560cb550a9e3b28 – Joe C Jun 01 '15 at 03:15
  • 3
    @JoeC: Yes, thanks, that's a sensible case to try as well. I added it to the updated answer. The dynamic case is important as well, because very often people are using `Map` to lookup for a getter. – Aleksey Shipilev Jun 01 '15 at 12:25
  • I am unable to do `invoke` or `invokeExact` on the methodHandles retrieved via `lookup().unreflectGetter`, `lookup().unreflectSetter` or `lookup().unreflect`. It gives an exception of "Cannot do invoke reflectively" – Optimizer Jul 24 '15 at 08:22
  • why cannot static reflection also be inlined? Is there any technical reason? – choxsword Feb 16 '21 at 10:23
20

Update: since some people started a pointless discussion about “how to benchmark” I will emphasize the solution to your problem contained in my answer, now right at the beginning:

You can use invokeExact even in your reflective context where you don’t have the exact type signature by converting the MethodHandle using asType to a handle taking Object as arguments. In environments affected by the performance difference between invoke and invokeExact, using invokeExact on such a converting handle is still way faster than using invoke on a direct method handle.


Original answer:

The problem is indeed that you are not using invokeExact. Below is a little benchmark program showing the results of different ways of incrementing an int field. Using invoke instead of invokeExact leads to a performance drop below the speed of Reflection.

You receive the WrongMethodTypeException because the MethodHandle is strongly typed. It expects an exact invocation signature matching the type of the field and its owner. But you can use the handle to create a new MethodHandle wrapping the necessary type conversions. Using invokeExact on that handle with a generic signature (i.e. (Object,Object)Object) will still be way more efficient than using invoke with a dynamic type conversion.
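Applied to the Map-based cache from the question, this could look roughly like the sketch below. GenericSetterCache, Pojo and demo are hypothetical names. As an assumption made for brevity, the handle is converted to (Object,Object)void rather than to type().generic(), so the call can be written as a plain statement with Object-typed arguments. The sketch only covers instance fields, since a setter handle for a static field takes a single argument.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class GenericSetterCache {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    // Every cached handle is adapted to this one signature: (Object,Object)void.
    private static final MethodType SETTER_TYPE =
            MethodType.methodType(void.class, Object.class, Object.class);
    private static final Map<Field, MethodHandle> SETTERS = new ConcurrentHashMap<>();

    // Sets an instance field through a cached, type-adapted handle.
    static void set(Field field, Object target, Object value) {
        try {
            MethodHandle mh = SETTERS.get(field);
            if (mh == null) {
                // asType wraps the cast of the receiver and the cast/unboxing of the value.
                mh = LOOKUP.unreflectSetter(field).asType(SETTER_TYPE);
                SETTERS.put(field, mh);
            }
            // Both arguments are statically typed Object and the result is unused,
            // so the call-site signature is exactly (Object,Object)void.
            mh.invokeExact(target, value);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Hypothetical POJO used only for the demonstration.
    static class Pojo {
        int count;
        String name;
    }

    // Small usage example: fills a Pojo through the cache.
    static Pojo demo() {
        try {
            Pojo p = new Pojo();
            set(Pojo.class.getDeclaredField("count"), p, 7);   // Integer is unboxed to int
            set(Pojo.class.getDeclaredField("name"), p, "x");
            return p;
        } catch (NoSuchFieldException e) {
            throw new AssertionError(e);
        }
    }
}
```

Passing a value of the wrong runtime type fails with a ClassCastException from the adapter, not with a WrongMethodTypeException at the call site.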

The results on my machine using 1.7.0_40 were:

direct        :   27,415ns
reflection    : 1088,462ns
method handle : 7133,221ns
mh invokeExact:   60,928ns
generic mh    :   68,025ns

and using a -server JVM yields a baffling

direct        :   26,953ns
reflection    :  629,161ns
method handle : 1513,226ns
mh invokeExact:   22,325ns
generic mh    :   43,608ns

I don’t think that it has much real life relevance seeing a MethodHandle being faster than a direct operation but it proves that MethodHandles are not slow on Java7.

And the generic MethodHandle will still outperform Reflection (whilst using invoke does not).

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

public class FieldMethodHandle
{
  public static void main(String[] args)
  {
    final int warmup=1_000_000, iterations=1_000_000;
    for(int i=0; i<warmup; i++)
    {
      incDirect();
      incByReflection();
      incByDirectHandle();
      incByDirectHandleExact();
      incByGeneric();
    }
    long direct=0, refl=0, handle=0, invokeExact=0, genericH=0;
    for(int i=0; i<iterations; i++)
    {
      final long t0=System.nanoTime();
      incDirect();
      final long t1=System.nanoTime();
      incByReflection();
      final long t2=System.nanoTime();
      incByDirectHandle();
      final long t3=System.nanoTime();
      incByDirectHandleExact();
      final long t4=System.nanoTime();
      incByGeneric();
      final long t5=System.nanoTime();
      direct+=t1-t0;
      refl+=t2-t1;
      handle+=t3-t2;
      invokeExact+=t4-t3;
      genericH+=t5-t4;
    }
    final int result = VALUE.value;
    // check (use) the value to avoid over-optimizations
    if(result != (warmup+iterations)*5) throw new AssertionError();
    double r=1D/iterations;
    System.out.printf("%-14s:\t%8.3fns%n", "direct", direct*r);
    System.out.printf("%-14s:\t%8.3fns%n", "reflection", refl*r);
    System.out.printf("%-14s:\t%8.3fns%n", "method handle", handle*r);
    System.out.printf("%-14s:\t%8.3fns%n", "mh invokeExact", invokeExact*r);
    System.out.printf("%-14s:\t%8.3fns%n", "generic mh", genericH*r);
  }
  static class MyValueHolder
  {
    int value;
  }
  static final MyValueHolder VALUE=new MyValueHolder();

  static final MethodHandles.Lookup LOOKUP=MethodHandles.lookup();
  static final MethodHandle DIRECT_GET_MH, DIRECT_SET_MH;
  static final MethodHandle GENERIC_GET_MH, GENERIC_SET_MH;
  static final Field REFLECTION;
  static
  {
    try
    {
      REFLECTION = MyValueHolder.class.getDeclaredField("value");
      DIRECT_GET_MH = LOOKUP.unreflectGetter(REFLECTION);
      DIRECT_SET_MH = LOOKUP.unreflectSetter(REFLECTION);
      GENERIC_GET_MH = DIRECT_GET_MH.asType(DIRECT_GET_MH.type().generic());
      GENERIC_SET_MH = DIRECT_SET_MH.asType(DIRECT_SET_MH.type().generic());
    }
    catch(NoSuchFieldException | IllegalAccessException ex)
    {
      throw new ExceptionInInitializerError(ex);
    }
  }

  static void incDirect()
  {
    VALUE.value++;
  }
  static void incByReflection()
  {
    try
    {
      REFLECTION.setInt(VALUE, REFLECTION.getInt(VALUE)+1);
    }
    catch(IllegalAccessException ex)
    {
      throw new AssertionError(ex);
    }
  }
  static void incByDirectHandle()
  {
    try
    {
      Object target=VALUE;
      Object o=GENERIC_GET_MH.invoke(target);
      o=((Integer)o)+1;
      DIRECT_SET_MH.invoke(target, o);
    }
    catch(Throwable ex)
    {
      throw new AssertionError(ex);
    }
  }
  static void incByDirectHandleExact()
  {
    try
    {
      DIRECT_SET_MH.invokeExact(VALUE, (int)DIRECT_GET_MH.invokeExact(VALUE)+1);
    }
    catch(Throwable ex)
    {
      throw new AssertionError(ex);
    }
  }
  static void incByGeneric()
  {
    try
    {
      Object target=VALUE;
      Object o=GENERIC_GET_MH.invokeExact(target);
      o=((Integer)o)+1;
      o=GENERIC_SET_MH.invokeExact(target, o);
    }
    catch(Throwable ex)
    {
      throw new AssertionError(ex);
    }
  }
}
Holger
  • 1
    I have some remarks to make. Using final static for the handle has a positive impact, but can that be used by aloo? I doubt it. Also you imho make a typical microbenchmarking mistake and benchmark several things at the same time. If you do so the order of the tests tend to influence the JVM. I did not check if that is here the case as well of course. – blackdrag Mar 11 '14 at 10:17
  • 1
    Doing several things at the same time may influence the result but not necessarily in a bad way. The intended use case will be ORM so I don’t expect the real case to consist of `MethodHandle` usage only. Neither does the `static final` nature of the fields matter too much, given the clear *magnitude* of the results. I wouldn’t derive any statement from it if the results were factors of less than ten but factors of thirty or more allow drawing conclusions. – Holger Mar 11 '14 at 10:41
  • 1
    I have to make some additional comments.. You try to test a single operation using nanoTime, but that has a precision too. If you comment out all the method calls between what is taking time, you will see you get nonzero times and the times are partially in the area of your results – blackdrag Mar 11 '14 at 11:46
  • 2
    Then I just double checked... try it out: after you have modified your benchmark to actually measure what happens, change incByDirectHandleExact to store the handle in a local variable before execution. In my case I moved the outer loop inside the methods and kept the assignment outside of that loop, of course. The result was that it takes 15 times longer than when using the final static fields. That is due to a special optimization that can only be done when you use a final static field. I thought it might have been removed again in later jdk7, but that doesn't seem to be the case – blackdrag Mar 11 '14 at 12:02
  • 1
    Microbenchmarks are better handled by things like JMH, your results aren't any good unfortunately. I'd question them even if the nanotime calls weren't inline with nanosecond-scale calls. – Xorlev Mar 11 '14 at 22:32
  • 1
    @Xorlev: Since Aleksey Shipilev’s answer proves the results using JMH I don’t get your point. JMH might be the better benchmark tool but the question was not about how to benchmark but how to solve the performance issue. And I answered that question. – Holger Mar 12 '14 at 09:20
  • 4
    Well, my result does not prove yours: your results are acquired using broken techniques. The mere fact it yields similar result does not mean my result somehow validates the way your results are acquired. – Aleksey Shipilev Mar 12 '14 at 09:55
  • @Holger what causes mh to be much faster than reflecion? – choxsword Feb 16 '21 at 10:33
  • 2
    @scottxiao no per-invocation security check, no autoboxing of primitive values, no array creation for the arguments, depending on the circumstances, better JVM optimizations. But note that this is not guaranteed. The topic of this Q&A isn’t even reflection vs mh, it’s about a failure of mh to deliver high performance (affecting older JVMs only). For your question, [this answer](https://stackoverflow.com/a/19563000/2711488) might fit better. It shows that, depending on the circumstances, reflection can be as fast as mh. – Holger Feb 16 '21 at 10:42
6

EDIT: thanks to Holger I noticed that I really should have used invokeExact, so I decided to remove the stuff about other jdks and use invokeExact only... using -server or not still does not really make a difference for me though

The main difference between using reflection and using MethodHandles is that for reflection you have a security check for every call, in case of MethodHandles, only for the creation of the handle.

If you look at this

import java.lang.reflect.Field;

class Test {
    public Object someField;
    public static void main(String[] args) throws Exception {
        Test t = new Test();
        Field field = Test.class.getDeclaredField("someField");
        Object value = new Object();
        for (int outer=0; outer<50; outer++) {
            long start = System.nanoTime();
            for (int i=0; i<100000000; i++) {
                field.set(t, value);
            }
            long time = (System.nanoTime()-start)/1000000;
            System.out.println("it took "+time+"ms");
        }
    }
}

On my computer this takes about 45000ms on jdk7u40 (jdk8 and pre-7u25 perform much better though)

Now let's look at the same program using handles

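import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

class Test {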
    public Object someField;
    public static void main(String[] args) throws Throwable {
        Test t = new Test();
        Field field = Test.class.getDeclaredField("someField");
        MethodHandle mh = MethodHandles.lookup().unreflectSetter(field);
        Object value = new Object();
        for (int outer=0; outer<50; outer++) {
            long start = System.nanoTime();
            for (int i=0; i<100000000; i++) {
                mh.invokeExact(t, value);
            }
            long time = (System.nanoTime()-start)/1000000;
            System.out.println("it took "+time+"ms");
        }
    }
}

7u40 says roughly 1288ms, so I can confirm Holger's factor of 30 on 7u40. On 7u06 the handle-based code would be slower, because reflection was several times faster there, and on jdk8 everything is new again.

As for why you didn't see an improvement... difficult to say. What I did was microbenchmarking, which doesn't tell you anything about a real application. But based on those results I would assume you either use an old jdk version or you don't reuse the handle often enough, because while executing a handle can be faster, creating a handle can cost much more than creating a Field.

Now the biggest problem point... I did see you want this for google appengine... And I must say, you can test locally as much as you want, what counts in the end is what the performance of the application on the google site will be. Afaik they use a modified OpenJDK, but what version with what modification they don't say. With Jdk7 being that unstable you could be unlucky or not. Maybe they added special code for reflection, then all bets are off anyway. And even ignoring that... maybe the payment model changed again, but usually you want to avoid datastore access by caching because it costs. If that still holds, is it then realistic that any handle will be called let's say 10.000 times on average?

blackdrag
  • 1
    For current 64Bit JVMs there is no client JVM so obviously `-server` makes no difference for 64Bit. And afaik tiered compilation is supposed to replace the client/server JVM model finally. So most probably there never will be a client 64Bit JVM. – Holger Mar 11 '14 at 13:05
  • yes, I know. Also that tiered compilation, often gives more trouble than it solves atm. I hope this will change in the future, but for now I usually have it turned off – blackdrag Mar 12 '14 at 10:46
6

There's a catch-22 for MethodHandles in JDK 7 and 8 (I haven't tested JDK 9 or higher yet): a MethodHandle is fast (as fast as direct access) if it is in a static final field. Otherwise it is as slow as reflection. If your framework reflects over n getters or setters, where n is unknown at compile time, then MethodHandles are probably useless to you.

I wrote an article that benchmarked all the different approaches to speed up reflection.

Use LambdaMetafactory (or more exotic approaches such as code generation) to speed up calling getters and setters. Here's the gist for a getter (for a setter use a BiConsumer):

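import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

public final class MyAccessor {

    private final Function getterFunction;

    public MyAccessor() {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        try {
            // Person is the bean class whose getName() getter is being wrapped.
            CallSite site = LambdaMetafactory.metafactory(lookup,
                    "apply",
                    MethodType.methodType(Function.class),
                    MethodType.methodType(Object.class, Object.class),
                    lookup.findVirtual(Person.class, "getName", MethodType.methodType(String.class)),
                    MethodType.methodType(String.class, Person.class));
            getterFunction = (Function) site.getTarget().invokeExact();
        } catch (Throwable e) {
            // findVirtual, metafactory and invokeExact all declare checked exceptions.
            throw new IllegalStateException(e);
        }
    }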

    public Object executeGetter(Object bean) {
        return getterFunction.apply(bean);
    }

}
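For the setter side, a BiConsumer-based variant might look like the following sketch. MySetterAccessor and the nested Person bean with its setName method are hypothetical stand-ins, since the original only shows the getter, and the code needs Java 8 or later. The interface method becomes "accept", the erased signature is (Object,Object)void, and the instantiated signature is (Person,String)void.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.BiConsumer;

class MySetterAccessor {

    // Hypothetical bean, assumed here only so the example is self-contained.
    static class Person {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    private final BiConsumer<Object, Object> setterFunction;

    @SuppressWarnings("unchecked")
    MySetterAccessor() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // Direct handle to Person.setName(String), type (Person,String)void.
            MethodHandle setter = lookup.findVirtual(Person.class, "setName",
                    MethodType.methodType(void.class, String.class));
            CallSite site = LambdaMetafactory.metafactory(lookup,
                    "accept",                                         // BiConsumer's method
                    MethodType.methodType(BiConsumer.class),          // factory type: () -> BiConsumer
                    MethodType.methodType(void.class, Object.class, Object.class), // erased signature
                    setter,
                    MethodType.methodType(void.class, Person.class, String.class)); // actual signature
            setterFunction = (BiConsumer<Object, Object>) site.getTarget().invokeExact();
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls bean.setName(value) through the generated BiConsumer, without reflection.
    void executeSetter(Object bean, Object value) {
        setterFunction.accept(bean, value);
    }
}
```

As with the getter, the expensive part is creating the accessor, so an instance should be built once per property and reused.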
Geoffrey De Smet