6

Does anyone know why I get such different performance when running this code on Java 8 and Java 11?

Without using any runtime flags, it appears that this code runs significantly slower under Java 11 compared to Java 8.

import java.util.Date;

public class PerformanceExperiment {
    public static volatile String s = "";

    public static void main(String[] args)
    {                         
        System.out.println("Starting performance test");
        String s1 = "STRING ONE";
        String s2 = "STRING TWO";
        long now1 = (new Date()).getTime();
        for (long i = 0; i < 1_000_000_00; i++)
        {
            s = "abc " + s1 + " def " + s2;
        }
        long now2 = (new Date()).getTime();
        System.out.println("initial block took " + (now2 - now1) + "ms");
        for (long i = 0; i < 4_000_000_00; i++)
        {
            s = "abc " + s1 + " def " + s2;
        }
        long now3 = (new Date()).getTime();
        System.out.println("Main block took " + (now3 - now2) + "ms");
    }
}

I have tried numerous command line flags but have not managed to find anything that matches the Java 8 performance.

I have only tested this on Windows so it could behave differently on other operating systems.

khelwood
  • 55,782
  • 14
  • 81
  • 108
Alex
  • 69
  • 1
  • 3
  • 1
    Good to read: https://redfin.engineering/java-string-concatenation-which-way-is-best-8f590a7d22a8 – Mebin Joe Mar 28 '19 at 11:07
  • 1
    Does the problem really exist ?? whats your data – VSB Mar 28 '19 at 12:11
  • 1
    Been using Java almost exclusively for more than a decade and I just learned you can put underscores in int literals... *mindblown* – ryvantage Mar 28 '19 at 12:27
  • @Alex how come you don't provide your output? You say "significantly slower" but what is significant? Does the program output anything at all? – ryvantage Mar 28 '19 at 12:34
  • @Alex did you use JMH. Please use it, it will warm up your JVM and more. – Christian Mar 28 '19 at 13:05
  • Every one of your `String` values that you concatenate is a compile-time constant. How do you know what you're measuring? – Andrew Henle Apr 01 '19 at 16:44
  • This might be a bug in the JDK itself. I tried your code, and the results are: initial block took 1809ms Main block took 5918ms I tried your code with "final" added to "s1" and "s2", and the results are: initial block took 118ms Main block took 432ms I increased the iterations to Integer.MAX_VALUE and the results are: initial block took 31300ms Main block took 31127ms Now Integer.MAX_VALUE + final: initial block took 1578ms Main block took 1327ms – Bogdan B Apr 06 '22 at 20:22

3 Answers3

8

I modified your app to

  1. Use System.nanoTime() instead of new Date() for greater precision (see this answer for more info: https://stackoverflow.com/a/1776053/963076).
  2. Use Netbeans profiler.
  3. Loop through 10 iterations

Using Netbeans 8.2 with JDK 8 v181:

Starting performance test 0
initial block took 3147ms
Main block took 9469ms
Starting performance test 1
initial block took 2398ms
Main block took 9601ms
Starting performance test 2
initial block took 2463ms
Main block took 9671ms
Starting performance test 3
initial block took 2464ms
Main block took 9565ms
Starting performance test 4
initial block took 2410ms
Main block took 9672ms
Starting performance test 5
initial block took 2418ms
Main block took 9598ms
Starting performance test 6
initial block took 2384ms
Main block took 9733ms
Starting performance test 7
initial block took 2402ms
Main block took 9610ms
Starting performance test 8
initial block took 2509ms
Main block took 11222ms
Starting performance test 9
initial block took 2455ms
Main block took 10661ms

And the profiler showed this telemetry:

enter image description here

With Netbeans 10.0 using JDK 11.0.2:

Starting performance test 0
initial block took 3760ms
Main block took 15056ms
Starting performance test 1
initial block took 3734ms
Main block took 14602ms
Starting performance test 2
initial block took 3615ms
Main block took 14762ms
Starting performance test 3
initial block took 3748ms
Main block took 14534ms
Starting performance test 4
initial block took 3628ms
Main block took 14759ms
Starting performance test 5
initial block took 3625ms
Main block took 14959ms
Starting performance test 6
initial block took 3987ms
Main block took 14967ms
Starting performance test 7
initial block took 3803ms
Main block took 14701ms
Starting performance test 8
initial block took 3599ms
Main block took 14762ms
Starting performance test 9
initial block took 3627ms
Main block took 14434ms

enter image description here

My conclusion: JDK 11 is doing much more work to be memory efficient. Notice the # of "surviving generations" in the garbage collector is way (way) less with JDK11 and the memory usage is also significantly less in terms of usage and volatility. The trade-off seems to be in speed, but the difference in speed is less than the difference in memory usage.

ryvantage
  • 13,064
  • 15
  • 63
  • 112
6

TL;DR: Need better benchmarks, better setup with controlling the differences between releases, etc. Most of the benchmarking issues would be trivially solved by using JMH. Current test behavior seems to be explained by questionable benchmarking approach and the change in default GC.

Consider this:

public class PerformanceExperiment {
    public static volatile String s = "";

    public static void main(String[] args) {
        for (int c = 0; c < 5; c++) {
            test();
        }
    }

    public static void test() {
        String s1 = "STRING ONE";
        String s2 = "STRING TWO";
        long time1 = System.currentTimeMillis();
        for (long i = 0; i < 4_000_000_00; i++) {
            s = "abc " + s1 + " def " + s2;
        }
        long time2 = System.currentTimeMillis();
        System.out.println("Main block took " + (time2 - time1) + "ms");
    }
}

First, it uses more convenient timings. Then, it measures the same bytecode block, while original tests warms up "initial one" and then proceeds to measure the absolutely cold one.

Then, JIT compilation would hit the method and you want to re-enter the method again to get the optimized code running, otherwise you are running the intermediate "on-stack-replacement" code -- you can do it with external iteration that calls into test. And on top of that, you want to enter multiple times to capture the most optimized version.

And, since the test allocates a lot, you want to nail the heap size.

So, here:

$ ~/Install/jdk8u191-rh/bin/javac PerformanceExperiment.java
$ ~/Install/jdk8u191-rh/bin/java -Xms2g -Xmx2g PerformanceExperiment
Main block took 10024ms
Main block took 9768ms
Main block took 7249ms
Main block took 7235ms
Main block took 7205ms

...and here is 11.0.2 on the same bytecode:

$ ~/Install/jdk11.0.2/bin/java -Xms2g -Xmx2g PerformanceExperiment
Main block took 9775ms
Main block took 10825ms
Main block took 8635ms
Main block took 8616ms
Main block took 8622ms

...and here is 11.0.2 with matching GC (9+ changed the default to G1 with JEP 248):

$ ~/Install/jdk11.0.2/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC PerformanceExperiment
Main block took 9281ms
Main block took 9129ms
Main block took 6725ms
Main block took 6688ms
Main block took 6684ms

On top of that, there is volatile store on every tiny iteration, that costs quite a lot and probably distorts the benchmark.

There is also interaction with indified string concat (JEP 280), thread-local handshakes (JEP 312), and other VM fixes, but you would probably see that only when compiling past target=8, which goes beyond the scope of this exercise.

Aleksey Shipilev
  • 18,599
  • 2
  • 67
  • 86
  • 1
    I agree with all the points you raise. However, in my opinion there is a concern that a rather simple and small code sample can take longer to run in a newer JDK out-of-the-box, without using any flags. – Alex Apr 01 '19 at 17:46
  • 4
    And? JDKs do differ, they make different tradeoffs which results in quantifiably different performance behaviors. Unless you can demonstrate _that particular code sample_ is overwhelmingly more important than everything else, then there is no issue here. In other words, your test definitely measures something, but the real question is whether it measures something you need. – Aleksey Shipilev Apr 01 '19 at 20:40
  • @alex has a point... we suffered from this change in behavior in a robotic Java app. The extra seconds needed for first few calls to string concat (in our logging code) threw us off during competition. Thanks to this thread, we eventually found the root cause, but it was too late. – Billy Apr 07 '19 at 18:12
  • @Billy don't ever change set-up before "production"? That is a golden rule, after all. – Eugene Dec 07 '20 at 01:39
0

This has now been confirmed as a bug in the JDK:

https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8221733

Alex
  • 69
  • 1
  • 3