I'm new to JHM, to test it out I wrote the following simple method
@Benchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public long sumPar(){
return
LongStream.rangeClosed(1, LIMIT)
.parallel()
.sum();
}
The results of the test varied greatly
# JMH 1.10.3 (released 8 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: /usr/lib/jvm/java-8-oracle/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: testStreams.Bench.sumPar
# Run progress: 22.21% complete, ETA 00:02:33
# Fork: 1 of 1
# Warmup Iteration 1: 53.495 ops/s
# Warmup Iteration 2: 57.971 ops/s
# Warmup Iteration 3: 57.622 ops/s
# Warmup Iteration 4: 58.113 ops/s
# Warmup Iteration 5: 57.861 ops/s
Iteration 1: 50.414 ops/s
Iteration 2: 9.207 ops/s
Iteration 3: 9.176 ops/s
Iteration 4: 9.212 ops/s
Iteration 5: 9.175 ops/s
The same behaviour was observed after multiple runs. After reducing the warm up iterations I still saw a drop in ops/s after 5/6 iterations. Other Benchmarks in the test suite which used parallel operations performed consistently.
I'm new to JMH, I have a few questions.
- Are there any configuration parameters in JMH I could tune to get less variance?
- Is this the expected behaviour of multiple iterations of the method in question?
- If JHM is reporting correctly how might I go about debugging this method and finding out why the ops/s drop?
Some specs
intel i7-4810MQ @ 2.80GHz
16GB RAM
Ubuntu 14.04 running on virtual box on windows host
virtual machine is alloacted 3 CPU cores
JDK 1.8.0_45
JMH 1.10.3
EDIT
Thanks for all the feedback, much appreciated.
I re-ran the benchmark this morning having left my laptop off all night. The inconsistent behaviour has completely disappeared. I upped the iterations to 1000 and still the ops/s remained consistent.
Checking CPU temp it was stable at 84 degrees.
I have not been able to reproduce this issue, next time I think my CPU may be overheating I'd like to re-run this benchmark and monitor CPU temp to see if this behaviour re-occurs.
Using -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining
when running did show similar compilation patterns to the method in Erratic performance of Arrays.stream().map().sum(), however I do not think that was the root cause of this issue.
EDIT 2
Was able to reproduce and solved by adding -XX:MaxInlineLevel=12
flag.
DevBox:~/test$ java -jar target/benchmarks.jar testStreams.Bench -i 5 -wi 5 -f 1
# JMH 1.10.3 (released 8 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: /usr/lib/jvm/java-8-oracle/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: testStreams.Bench.sumPar
# Run progress: 0.00% complete, ETA 00:00:30
# Fork: 1 of 1
# Warmup Iteration 1: 53.672 ops/s
# Warmup Iteration 2: 57.720 ops/s
# Warmup Iteration 3: 58.320 ops/s
# Warmup Iteration 4: 58.174 ops/s
# Warmup Iteration 5: 58.680 ops/s
Iteration 1: 49.810 ops/s
Iteration 2: 9.109 ops/s
Iteration 3: 9.427 ops/s
Iteration 4: 9.437 ops/s
Iteration 5: 9.436 ops/s
DevBox:~/test$ java -XX:MaxInlineLevel=12 -jar target/benchmarks.jar testStreams.Bench -i 1000 -wi 5 -f 1
# JMH 1.10.3 (released 8 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: /usr/lib/jvm/java-8-oracle/jre/bin/java
# VM options: -XX:MaxInlineLevel=12
# Warmup: 5 iterations, 1 s each
# Measurement: 1000 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: testStreams.Bench.sumPar
# Run progress: 0.00% complete, ETA 00:50:16
# Fork: 1 of 1
# Warmup Iteration 1: 53.888 ops/s
# Warmup Iteration 2: 58.328 ops/s
# Warmup Iteration 3: 58.468 ops/s
# Warmup Iteration 4: 58.455 ops/s
# Warmup Iteration 5: 57.937 ops/s
Iteration 1: 58.717 ops/s
Iteration 2: 59.494 ops/s
Iteration 3: 60.013 ops/s
Iteration 4: 59.506 ops/s
Iteration 5: 51.543 ops/s
Still haven't found root cause as to why I couldn't reproduce the issue when -XX:MaxInlineLevel=12
was omitted before. As far as I can see I was using the same settings. I may try run benchmark the again after my laptop has been left idle for a while, but for now am happy I gained some insight into JIT inlining.