0

When escape analysis is turned on, GC still occurs once every millisecond, and the running time is the same as the time without this optimization.

I reached this conclusion by using JMH to test, below is my code and test result.

Code:

/**
 * @author Jmc
 */
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 3, time = 10)
@Measurement(iterations = 3, time = 10)
@Fork(1)
public class EscapeAnalysisTest {
    /**
     * VM Options: -Xlog:gc:empty.gc.log -Xmx30M
     */
    @Benchmark
    public void empty() {
        // do nothing
    }

    /**
     * VM Options: -Xlog:gc:withEA.gc.log -Xmx30M
     */
    @Benchmark
    public void withEA() {
        Object o = null;
        for (int i = 0; i < 1_000_000_000; i++) {
            o = new Object();
        }
        int hashcode = o.hashCode();
    }

    /**
     * VM Options: -Xlog:gc:withoutEA.gc.log -Xmx30M -XX:-DoEscapeAnalysis
     */
    @Benchmark
    public void withoutEA() {
        Object o = null;
        for (int i = 0; i < 1_000_000_000; i++) {
            o = new Object();
        }
        int hashcode = o.hashCode();
    }
}

Output:

  • empty - JMH
# JMH version: 1.32
# VM version: JDK 17, Java HotSpot(TM) 64-Bit Server VM, 17+35-LTS-2724
# VM invoker: D:\Programs\JDK\jdk-17\bin\java.exe
# VM options: -Xlog:gc:empty.gc.log -Xmx30M --enable-preview -javaagent:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\lib\idea_rt.jar=54629:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint
# Warmup: 3 iterations, 10 s each
# Measurement: 3 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.test.jmh.EscapeAnalysisTest.empty

...

Benchmark                 Mode  Cnt  Score   Error  Units
EscapeAnalysisTest.empty  avgt    3  0.517 ± 1.005  ns/op

  • withEA - JMH
# JMH version: 1.32
# VM version: JDK 17, Java HotSpot(TM) 64-Bit Server VM, 17+35-LTS-2724
# VM invoker: D:\Programs\JDK\jdk-17\bin\java.exe
# VM options: -Xlog:gc:withEA.gc.log -Xmx30M --enable-preview -javaagent:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\lib\idea_rt.jar=54648:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint
# Warmup: 3 iterations, 10 s each
# Measurement: 3 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.test.jmh.EscapeAnalysisTest.withEA

...

Benchmark                  Mode  Cnt          Score           Error  Units
EscapeAnalysisTest.withEA  avgt    3  900223198.990 ± 468305893.304  ns/op
  • withoutEA - JMH
# JMH version: 1.32
# VM version: JDK 17, Java HotSpot(TM) 64-Bit Server VM, 17+35-LTS-2724
# VM invoker: D:\Programs\JDK\jdk-17\bin\java.exe
# VM options: -Xlog:gc:withoutEA.gc.log -Xmx30M -XX:-DoEscapeAnalysis --enable-preview -javaagent:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\lib\idea_rt.jar=54708:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint
# Warmup: 3 iterations, 10 s each
# Measurement: 3 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.test.jmh.EscapeAnalysisTest.withoutEA

...

Benchmark                     Mode  Cnt          Score           Error  Units
EscapeAnalysisTest.withoutEA  avgt    3  905957459.624 ± 472436917.469  ns/op
  • empty GC log
[0.039s][info][gc] Using G1
[0.010s][info][gc] Using G1
  • withEA GC log
[0.011s][info][gc] Using G1
[0.011s][info][gc] Using G1
[0.596s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 13M->2M(30M) 5.488ms
[0.603s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 14M->2M(30M) 3.900ms
[0.606s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 15M->2M(30M) 0.403ms

...

[63.402s][info][gc] GC(14235) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.519ms
[63.406s][info][gc] GC(14236) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.558ms
[63.410s][info][gc] GC(14237) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.509ms
[63.414s][info][gc] GC(14238) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.650ms
[63.418s][info][gc] GC(14239) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.832ms

  • withoutEA GC log
[0.013s][info][gc] Using G1
[0.012s][info][gc] Using G1
[0.607s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 13M->2M(30M) 4.942ms
[0.614s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 14M->2M(30M) 3.855ms
[0.617s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 15M->2M(30M) 0.345ms

...

[63.837s][info][gc] GC(15086) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.971ms
[63.841s][info][gc] GC(15087) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.038ms
[63.845s][info][gc] GC(15088) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.953ms
[63.850s][info][gc] GC(15089) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.233ms
[63.858s][info][gc] GC(15090) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.499ms

Note: From the code and result above, you can see the empty method doesn't occurs gc and in terms of other two java method, the running time is about the same. The GC still occurs when opened EA, I want to know the reason.

Jmc
  • 1
  • 3
  • 1
    This is a very broken microbenchmark. There's so many things wrong, it's hard to say what you're even trying to prove. For one, there's no warmup. There's a good chance that your code isn't even fully optimized before it finishes running. See [this question](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) for some pointers (hint: use an existing framework like JMH, and even if you do, it can be tricky). The most likely reason for the GC not occuring when the loop count is low is that the program simply ends before a GC becomes necessary. – Joachim Sauer Dec 04 '21 at 09:07
  • Yeah, you're right. But with escape analysis, the GC shouldn't occurs, because the Objects are created on the STACK memory and destroyed one by one, it doesn't show up on HEAP memory so the GC won't show up too. – Jmc Dec 04 '21 at 15:08
  • 1
    Escape analysis isn't done the same way at all levels of optimizations. And optimizations often can't be switched on during a single method invocation. It's quite possible that if you test this in a sensible way (i.e. with a proper microbenchmarking framework) it'll show that the whole loop will be optimized away. But you only call it once, which makes it pretty unlikely that it'll ever run very optimized code. You are making conclusions from faulty assumptions. – Joachim Sauer Dec 04 '21 at 15:58
  • Thanks for your suggestions! I'll test it correctly and update my problem. – Jmc Dec 04 '21 at 16:30
  • 1
    @Jmc EA doesn't mean that the objects are created on the stack. As Joachim says, the optimisation might not even happen. My guess would be that your code isn't even executed, the loop does nothing. Use JMH if you want a reliable microbenchmark. – Erik Dec 06 '21 at 09:50
  • Thanks for your suggestions! The problem is updated. Here comes other problems. – Jmc Dec 07 '21 at 09:20
  • I'm wondering. Is this actually a question, or simply a presentation of some interesting observations? (What would an "answer" to this look like?) Would this be better as a blog post? – Stephen C Dec 07 '21 at 10:03
  • OK, I updated the problem~ – Jmc Dec 07 '21 at 11:46
  • 1
    What does `-javaagent:…\idea_rt.jar` do? How can you be sure that the garbage collection is caused by your loop when there’s other software running in the same JVM at the same time? – Holger Dec 07 '21 at 11:52
  • `-javaagent:…\idea_rt.jar` is added automatically by idea, and now I enlarge the memory and add a control experiment. – Jmc Dec 07 '21 at 12:56
  • 1
    Very interesting. The `int hashcode = o.hashCode();` after the loop seems to be enough to make Escape Analysis ineffective here. But on the other hand, the reason why there’s no performance difference between enabling and disabling EA seems to be that trying to turn it off via `-XX:-DoEscapeAnalysis` has no effect at all. If I remove the problematic line, the allocations get eliminated whether the option is present or not. – Holger Dec 07 '21 at 17:12

2 Answers2

1

Escape Analysis is fragile. In your case, the statement int hashcode = o.hashCode(); after the loop is enough to make it ineffective. Calling the non-overridden hashCode() prevents an object from getting eliminated, as it is an identity-sensitive operation. But you further hit a limitation of the analyzer by using an object created within the loop body after the loop.

To simplify the test for the presence of Escape Analysis (or more generally, the presence of anything that eliminates allocations) and to prevent interference with Dead Code Elimination, I used

public class TestEA {
    public static void main(String[] args) {
        int result;
        if(args.length == 0 || !args[0].equals("NoUseAfterLoop")) {
            result = runEaTest();
        }
        else {
            result = runEaTestNoUseAfterLoop();
        }
        System.out.println(result);
    }
    private static int runEaTest() {
        record MyObject(int i) {};
        int sum = 0;
        MyObject o = null;
        for (int i = 1; i < 1_000_000_000; i++) {
            o = new MyObject(1);
            sum += o.i();
        }
        sum += o.i();
        return sum;
    }
    private static int runEaTestNoUseAfterLoop() {
        record MyObject(int i) {};
        int sum = 0;
        MyObject o = null;
        for (int i = 0; i < 1_000_000_000; i++) {
            o = new MyObject(1);
            sum += o.i();
        }
        return sum;
    }
}

This example uses the temporary object to store and retrieve a value which is used to calculate the returned sum. Therefore, the loop can’t be flatly removed.

When I ran this program via

jdk-17\bin\java -Xmx40m -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC TestEA

I got the expected

[0.007s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
Terminating due to java.lang.OutOfMemoryError: Java heap space

while

jdk-17\bin\java -Xmx40m -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC TestEA NoUseAfterLoop

did not result in an OutOfMemoryError:

[0.007s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
1000000000

Since creating 1,000,000,000 objects in a 64bit HotSpot JVM requires 16,000,000,000 bytes, and the EpsilonGC implies no attempt to re-use memory, running the loop with a 40M heap without an error indicates that allocations have been eliminated (and demonstrates a difference to the other method).

It’s the use of an object originating from the loop body which inhibits Escape Analysis here. But this example executes the method a single time only. The behavior may change again after the first execution, i.e. when all branches have been taken at least once.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • I added `int hashcode = o.hashcode();` because I must make sure the for loop will be executed, without this statement, JVM may deletes the whole for loop at runtime because it's useless. So your tests just showed that JVM did delete the `useless code` at runtime and EA just didn't work. – Jmc Dec 08 '21 at 04:50
  • But for one point you're right, maybe `hashcode()` method inhibits EA. I'll test this further. – Jmc Dec 08 '21 at 08:36
  • 1
    Yes, `hashcode()` does indeed inhibit EA, but the flow analysis limitation still applies, after all, it’s the same optimizer whether it does Dead Code Elimination or Escape Analysis. But you’re right, the reason why `-XX:-DoEscapeAnalysis` had no effect seems to be that JDK 15 introduced a Dead Code Elimination that works even without Escape Analysis. I change the example to actually work with the object and return a result. Now, `-XX:-DoEscapeAnalysis` works as expected when the object is not used outside the loop body. – Holger Dec 08 '21 at 09:21
  • YES, then I concluded a correct conclusion and write a new answer below~ – Jmc Dec 08 '21 at 09:29
  • 1
    But your example still doesn’t produce a result, so it’s still hard to distinguish Dead Code Elimination from Escape Analysis. I think, my updated example producing and returning a result based on the object demonstrates better that EA can perform the desired operation without the memory. Mind that “*creates Objects on stack*” is a tenacious wrong description of what happens here. The object is eliminated completely, its field is treated like a local variable, which may end up at a CPU register, or get replaced by the value used to initialize it. In my example, it could just predict the result – Holger Dec 08 '21 at 09:34
  • Yes, I set the stack memory to 180K(Min Value), then at runtime, Stack Overflow Exception didn't occurs. – Jmc Dec 08 '21 at 10:06
0

The escape analysis (EA) works in some conditions:

  1. Objects declared in method can't be exposed to any other threads, methods, objects.
  2. Objects declared in method can't invoke native method.
  3. The method should be invoked multiple times then EA will work.

The following test cases will prove EA works.

Code

/**
 * @author Jmc
 */
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 3, time = 10)
@Measurement(iterations = 3, time = 10)
@Fork(1)
public class EscapeAnalysisTest {
    static class Int {
        private final int value;

        public Int(int value) {
            this.value = value;
        }

        public int hash() {
            return value;
        }
    }

    /**
     * VM Options: -Xlog:gc:withoutEA.gc.log -Xmx30M -XX:-DoEscapeAnalysis
     */
    @Benchmark
    public void withoutEA() {
        Int n = null;
        for (int i = 0; i < 1_000_000_000; i++) {
            n = new Int(i);
        }

        // ensure for loop is executed
        int hash = n.hash();
    }

    /**
     * VM Options: -Xlog:gc:withEA.gc.log -Xmx30M
     */
    @Benchmark
    public void withEA() {
        Int n = null;
        for (int i = 0; i < 1_000_000_000; i++) {
            n = new Int(i);
        }
        
        // ensure for loop is executed
        int hash = n.hash();
    }
}

JMH Result

  • Without EA
# JMH version: 1.32
# VM version: JDK 17, Java HotSpot(TM) 64-Bit Server VM, 17+35-LTS-2724
# VM invoker: D:\Programs\JDK\jdk-17\bin\java.exe
# VM options: -Xlog:gc:withoutEA.gc.log -Xmx30M -XX:-DoEscapeAnalysis --enable-preview -javaagent:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\lib\idea_rt.jar=62639:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint
# Warmup: 3 iterations, 10 s each
# Measurement: 3 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.test.jmh.EscapeAnalysisTest.withoutEA

...

Benchmark                     Mode  Cnt           Score           Error  Units
EscapeAnalysisTest.withoutEA  avgt    3  3829083588.889 ± 204309555.709  ns/op
  • With EA
# JMH version: 1.32
# VM version: JDK 17, Java HotSpot(TM) 64-Bit Server VM, 17+35-LTS-2724
# VM invoker: D:\Programs\JDK\jdk-17\bin\java.exe
# VM options: -Xlog:gc:withEA.gc.log -Xmx30M --enable-preview -javaagent:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\lib\idea_rt.jar=62630:D:\Programs\JetBrains\IntelliJ IDEA 2021.2.2\bin -Dfile.encoding=UTF-8
# Blackhole mode: full + dont-inline hint
# Warmup: 3 iterations, 10 s each
# Measurement: 3 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.test.jmh.EscapeAnalysisTest.withEA

...

Benchmark                  Mode  Cnt  Score   Error  Units
EscapeAnalysisTest.withEA  avgt    3  0.659 ± 0.442  ns/op

GC Log

  • Without EA
[0.013s][info][gc] Using G1
[0.028s][info][gc] Using G1
[0.524s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 13M->2M(30M) 4.725ms
[0.531s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 14M->2M(30M) 3.718ms
[0.534s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 15M->2M(30M) 0.345ms

...

[68.030s][info][gc] GC(15298) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.189ms
[68.035s][info][gc] GC(15299) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.272ms
[68.039s][info][gc] GC(15300) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.132ms
[68.044s][info][gc] GC(15301) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.044ms
[68.049s][info][gc] GC(15302) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 1.184ms
  • With EA
[0.011s][info][gc] Using G1
[0.009s][info][gc] Using G1
[0.553s][info][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 13M->2M(30M) 5.623ms
[0.559s][info][gc] GC(1) Pause Young (Normal) (G1 Evacuation Pause) 14M->2M(30M) 3.804ms
[0.562s][info][gc] GC(2) Pause Young (Normal) (G1 Evacuation Pause) 15M->2M(30M) 0.309ms

...

[7.185s][info][gc] GC(1695) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.638ms
[7.189s][info][gc] GC(1696) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.524ms
[7.193s][info][gc] GC(1697) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.572ms
[7.196s][info][gc] GC(1698) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.370ms
[7.200s][info][gc] GC(1699) Pause Young (Normal) (G1 Evacuation Pause) 20M->2M(30M) 0.542ms

From the test result, you can see:

  1. The running time of With EA is much less than Without EA
  2. After withEA() is invoked multiple times (For this test is after 7.2s), the EA works and then GC does't occurs any more. On the contrary withoutEA() occurs GC the whole time.

So the conclusion is EA works and take some measures to reduce GC (maybe creates Objects on stack or destroy Objects on heap at once or something), it makes java program runs more efficiently.

Jmc
  • 1
  • 3