
In a performance test of a Java REST service, I observed an unexpected pattern: a method that creates and returns the same value object on every invocation runs faster than a version that simply returns a value object stored in a static or instance field.

Code:

@POST @Path("inline") public Response inline(String s) { 
    return Response.status(Status.CREATED).build(); 
}    

private static final Response RESP = Response.status(Status.CREATED).build();
@POST @Path("staticfield") public Response staticField(String s) { 
    return RESP; 
}

private final Response resp = Response.status(Status.CREATED).build();
@POST @Path("field") public Response field(String s) { 
    return resp; 
}

Byte code:

  • Inline (faster): getstatic, invokestatic, invokevirtual, areturn
  • Static field (slower): getstatic, areturn
  • Object field (slower): aload, getfield, areturn

Performance (measured with ApacheBench (ab), single thread, several runs with consistent results):

  • Inline: 17078.29 [#/sec] (mean)
  • Static field: 5242.64 [#/sec] (mean)
  • Object field: 5417.40 [#/sec] (mean)

Environment: RHEL 6, Oracle JDK 1.7.0_60-b19, 64-bit

Is it possible that the JVM compiled the inline version to optimized native code, but never considered optimizing the other two because they are already so small?

Sebastian
  • I think it's most likely that something's not working the way you think it is, outside of the above code. – Hot Licks Jul 25 '14 at 17:18
  • "Returns always the same value object"... Perhaps the REST layer knows that the result can be cached then? – Thorbjørn Ravn Andersen Jul 25 '14 at 17:28
  • Post a complete, compilable benchmark. Only then can we dig into what's going on. – tmyklebu Jul 25 '14 at 17:54
  • @ThorbjørnRavnAndersen I've verified that the response is not cached. The REST framework (which I have excluded to simplify) should not interfere because the class exposes exactly the same functional behavior in the three methods. The only difference is the implementation, which is visible only to the JVM. – Sebastian Jul 25 '14 at 19:13
  • @tmyklebu thanks for jumping in. Can you elaborate on this? What would you like to have in addition? I'm trying to keep the question as simple as possible (because I presume this is due to a JVM optimization), but I'd be happy to add anything suggested. – Sebastian Jul 25 '14 at 19:16
  • @HotLicks I understand what you are saying. However, anything outside the above code should see the three methods in exactly the same way (they have the same signature and exactly the same functional behavior), which is why I conclude that whatever is affecting these numbers has access to the implementation. In fact, if I switch the implementation but keep everything else the same, the numbers change accordingly. – Sebastian Jul 25 '14 at 19:20
  • You've asked for an explanation. The most plausible one is that something outside of the above code accounts for the difference. E.g., if a new instance were created for every iteration, that would easily explain the Object case. And weird stuff with "unsafe" (which is apt to be present in many server environments) could easily account for both cases. – Hot Licks Jul 25 '14 at 19:25
  • Could you try to get a listing of JIT-compiled asm for the above methods using one of the techniques mentioned here? http://stackoverflow.com/questions/1503479/how-to-see-jit-compiled-code-in-jvm – Alex D Jul 25 '14 at 19:40
  • @Sebastian make a complete example _others_ can download and run. Either you find out why you see what you do in the process of trimming or others can have a closer look at your trimmed example. If you do not do that, your question will most likely be closed. – Thorbjørn Ravn Andersen Jul 25 '14 at 20:24

1 Answer


As pointed out in the comments, it is difficult to tell without actually looking at the assembly. As you are using a REST framework, however, I assume it would be hard to tell even from the assembly, as there is quite a lot of code to read.

Instead, I want to give you an educated guess, because your code is an archetypal example of where constant folding applies. When a value is created inline and not read from a field, the JVM can safely assume that this value is constant. When JIT-compiling the method, the constant expression can therefore be safely merged with your framework code, which probably leads to less JIT-compiled assembly and therefore improved performance. For a field value, even a final one, a constant value cannot be assumed, as the field value can change (for example, through reflection). (The exception is a field that is a compile-time constant, that is, a primitive or a constant String, which javac inlines.) The JVM can therefore probably not constant-fold the value.
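To make the distinction between a compile-time constant and an ordinary final field more concrete, here is a minimal sketch. The class and member names (ConstantInliningDemo, Holder, overwrite, demo) are made up for illustration and have nothing to do with the REST code in the question. It overwrites two final instance fields through reflection and then reads them back through ordinary getters.

```java
import java.lang.reflect.Field;

public class ConstantInliningDemo {

    // Hypothetical holder class, used only for this illustration.
    static final class Holder {
        // Constant variable (final primitive with a constant initializer):
        // javac replaces reads of it with the literal 1.
        private final int constant = 1;
        // Final, but not a compile-time constant: each read is a real field access.
        private final Integer boxed = Integer.valueOf(1);

        int readConstant() { return constant; }
        int readBoxed() { return boxed; }
    }

    // Overwrites a final instance field reflectively (for demonstration only).
    static void overwrite(Holder h, String fieldName, int value) {
        try {
            Field f = Holder.class.getDeclaredField(fieldName);
            f.setAccessible(true);
            f.set(h, value); // value is boxed; Field.set unwraps it for the int field
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns what the two getters report after both fields were overwritten with 2.
    static int[] demo() {
        Holder h = new Holder();
        overwrite(h, "constant", 2);
        overwrite(h, "boxed", 2);
        return new int[] { h.readConstant(), h.readBoxed() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("constant field read: " + r[0]);
        System.out.println("boxed field read:    " + r[1]);
    }
}
```

On a typical HotSpot JDK, the getter for the compile-time constant should still report 1, because javac baked the literal into the method, while the getter for the boxed field should report the overwritten value 2. Newer JDK releases may print a warning about mutating final fields.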

You can read more on constant folding in the JMH tutorial, which notes:

If JVM realizes the result of the computation is the same no matter what, it can cleverly optimize it. In our case, that means we can move the computation outside of the internal JMH loop. This can be prevented by always reading the inputs from the state, computing the result based on that state, and the follow the rules to prevent DCE.

I hope you used such a framework. Otherwise, your performance metric is unlikely to be valid.
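The two rules from the quote (read inputs from state, and do not let results become dead code) can be sketched in plain Java. This is only an illustration of the idea under my own assumptions, not a replacement for JMH, and every name in it (NaiveBenchSketch, input, sink, compute, timeLoop) is hypothetical.

```java
public class NaiveBenchSketch {

    // Non-final state: the input is read from here instead of being a literal,
    // so it is not a compile-time constant.
    static int input = 42;

    // The accumulated result is published here so that the computation
    // has an observable effect and cannot simply be dropped as dead code.
    static volatile long sink;

    // Stand-in for the code under test.
    static long compute(int x) {
        return (long) x * x + 1;
    }

    // Runs the given number of calls and returns the elapsed time in nanoseconds.
    static long timeLoop(int iterations) {
        long acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += compute(input + (i & 1)); // vary the argument slightly per iteration
        }
        long elapsed = System.nanoTime() - start;
        sink = acc; // consume the result
        return elapsed;
    }

    public static void main(String[] args) {
        timeLoop(1_000_000);              // warm-up pass, gives the JIT a chance to compile
        long nanos = timeLoop(1_000_000); // measured pass
        System.out.println("elapsed ns: " + nanos);
    }
}
```

Even with these precautions, a hand-rolled loop like this remains exposed to other JIT effects, which is why a dedicated harness is the safer choice.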

From reading the byte code, you can generally not learn much about runtime performance, as the JIT compiler can transform the byte code into almost anything during optimization. The byte code layout should only matter while code is interpreted, which is generally not the state in which one would measure performance, since performance-critical hot code is always JIT-compiled.

Rafael Winterhalter
  • This is the best answer so far. I've repeated the performance tests with no frameworks at all, and the inline method is slower (the opposite result). So, whatever optimization is happening, it happens only when the framework code is active. As you say, it is difficult to be sure that this is what is really happening in this case, but your explanation fits the context nicely and I learned something new today thanks to your response. Thanks! – Sebastian Jul 27 '14 at 00:41