Disclaimer: I read this article by Alexey Shipilev and understand that nanobenchmarks are a kind of evil. But I still want to experiment and understand this for myself.
I'm trying to measure array creation vs boxing of a byte. Here is my benchmark:
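import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@Fork(1)
@Warmup(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
@Measurement(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
public class MyBenchmark {

    // Allocates a one-element byte array and stores a value in it.
    @Benchmark
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    public void arrayBenchmark(Blackhole bh) {
        byte[] b = new byte[1];
        b[0] = 20;
        bh.consume(b);
    }

    // Allocates a fresh Byte wrapper (no Byte.valueOf cache involved).
    @Benchmark
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    public void bonxingBenchmark(Blackhole bh) {
        bh.consume(new Byte((byte) 20));
    }
}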
I ran this benchmark several times and found that boxing is 1.5 times faster than creating an array and putting an element into it.
So I decided to run these benchmarks with -prof gc. The result is this:
MyBenchmark.arrayBenchmark avgt 5 7.751 ± 0.537 ns/op
MyBenchmark.arrayBenchmark:·gc.alloc.rate avgt 5 1966.743 ± 143.624 MB/sec
MyBenchmark.arrayBenchmark:·gc.alloc.rate.norm avgt 5 24.000 ± 0.001 B/op
MyBenchmark.arrayBenchmark:·gc.churn.PS_Eden_Space avgt 5 1966.231 ± 326.378 MB/sec
MyBenchmark.arrayBenchmark:·gc.churn.PS_Eden_Space.norm avgt 5 23.999 ± 4.148 B/op
MyBenchmark.arrayBenchmark:·gc.churn.PS_Survivor_Space avgt 5 0.042 ± 0.113 MB/sec
MyBenchmark.arrayBenchmark:·gc.churn.PS_Survivor_Space.norm avgt 5 0.001 ± 0.001 B/op
MyBenchmark.arrayBenchmark:·gc.count avgt 5 37.000 counts
MyBenchmark.arrayBenchmark:·gc.time avgt 5 48.000 ms
MyBenchmark.bonxingBenchmark avgt 5 6.123 ± 1.306 ns/op
MyBenchmark.bonxingBenchmark:·gc.alloc.rate avgt 5 1664.504 ± 370.508 MB/sec
MyBenchmark.bonxingBenchmark:·gc.alloc.rate.norm avgt 5 16.000 ± 0.001 B/op
MyBenchmark.bonxingBenchmark:·gc.churn.PS_Eden_Space avgt 5 1644.547 ± 1004.476 MB/sec
MyBenchmark.bonxingBenchmark:·gc.churn.PS_Eden_Space.norm avgt 5 15.769 ± 7.495 B/op
MyBenchmark.bonxingBenchmark:·gc.churn.PS_Survivor_Space avgt 5 0.037 ± 0.067 MB/sec
MyBenchmark.bonxingBenchmark:·gc.churn.PS_Survivor_Space.norm avgt 5 ≈ 10⁻³ B/op
MyBenchmark.bonxingBenchmark:·gc.count avgt 5 23.000 counts
MyBenchmark.bonxingBenchmark:·gc.time avgt 5 37.000 ms
As we can see, the GC is more heavily loaded in the arrayBenchmark case: the allocation rate is 1966 MB/sec vs 1664 MB/sec, and gc.count and gc.time also differ. I think that's the cause, but I'm not sure.
For now I don't quite understand this behaviour. I thought array allocation in my case just means that we allocate 1 byte somewhere. To me it looks pretty much the same as boxing, but it actually behaves differently.
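To check the "1 byte somewhere" intuition against the gc.alloc.rate.norm numbers, here is a rough sketch of the size arithmetic. It assumes a 64-bit HotSpot JVM with compressed class pointers: a 12-byte object header, an extra 4-byte length field for arrays, and 8-byte alignment. Those constants and the class name ObjectSizeSketch are assumptions for illustration, not values read from the JVM that produced the results.

```java
// Back-of-the-envelope object sizes. The constants are assumptions for a
// 64-bit HotSpot JVM with compressed class pointers, not values queried from the JVM.
public class ObjectSizeSketch {
    static final int OBJECT_HEADER = 12;     // assumed: 8-byte mark word + 4-byte class pointer
    static final int ARRAY_LENGTH_FIELD = 4; // assumed: int length stored in every array header
    static final int ALIGNMENT = 8;          // assumed: default object alignment in bytes

    // Round a raw size up to the next multiple of ALIGNMENT.
    static int align(int size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Size of a plain object whose instance fields occupy fieldBytes.
    static int instanceSize(int fieldBytes) {
        return align(OBJECT_HEADER + fieldBytes);
    }

    // Size of a byte[] holding the given number of elements.
    static int byteArraySize(int length) {
        return align(OBJECT_HEADER + ARRAY_LENGTH_FIELD + length);
    }

    public static void main(String[] args) {
        System.out.println("new Byte(..) : " + instanceSize(1) + " bytes");
        System.out.println("new byte[1]  : " + byteArraySize(1) + " bytes");
    }
}
```

Under these assumptions the sketch gives 16 bytes for the Byte and 24 bytes for the byte[1], which lines up with the 16.000 B/op and 24.000 B/op reported above. A layout tool would be needed to confirm the real header sizes on a given JVM.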
Can you help me to understand it?
And what's most important... Can I trust this benchmark?