I've got a very simple JMH micro-benchmark:

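import static java.util.stream.Collectors.toList;

import java.util.List;
import java.util.Random;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
public class Test {

  // 100,000 random boxed ints, built once when the state object is created
  List<Integer> list = new Random().ints(100_000).boxed().collect(toList());

  @Benchmark public int mapToInt() {
    // x * x unboxes to int and may overflow; the value only matters as a result to return
    return list.stream().mapToInt(x -> x * x).sum();
  }
}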

When I run it, I always get a result where the first warmup run is much faster than the next runs:

# Warmup Iteration   1: 171.596 us/op
# Warmup Iteration   2: 689.337 us/op
....
Iteration   1: 677.625 us/op
....

Command line:

java -jar target/benchmarks.jar .*Test.* -wi 5 -w 1000ms -i 10 -r 1000ms -t 1 -f 5 -tu us

Playing with the number of forks or threads does not seem to make a difference.

So it looks like some optimisation gets reverted but I can't find what it is.

Is the degraded performance due to an issue with my benchmark or is this de-optimisation representative of what would happen in a real application?
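
One way to look at the effect outside JMH is a bare timing loop over the same pipeline. The following is only a rough sketch, not a substitute for JMH: the class `WarmupProbe`, its methods, and the round and call counts are hypothetical names and values chosen for illustration, and the timings it prints will vary by JVM version and flags.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical helper (not part of JMH): times the same pipeline round by round.
final class WarmupProbe {

    // Written once per timing run so the computed sums stay observable.
    static int blackhole;

    // Same computation as the benchmark method; int overflow wraps silently.
    static int sumOfSquares(List<Integer> list) {
        return list.stream().mapToInt(x -> x * x).sum();
    }

    // Returns the average time per call, in microseconds, for each round.
    static double[] timeRounds(List<Integer> list, int rounds, int callsPerRound) {
        double[] avgMicros = new double[rounds];
        int sink = 0;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerRound; c++) {
                sink += sumOfSquares(list);
            }
            long elapsedNanos = System.nanoTime() - start;
            avgMicros[r] = elapsedNanos / 1_000.0 / callsPerRound;
        }
        blackhole = sink;
        return avgMicros;
    }

    public static void main(String[] args) {
        List<Integer> list =
            new Random().ints(100_000).boxed().collect(Collectors.toList());
        // Round and call counts are arbitrary assumptions; adjust as needed.
        double[] avg = timeRounds(list, 40, 1_000);
        for (int r = 0; r < avg.length; r++) {
            System.out.printf("round %2d: %.3f us/op%n", r + 1, avg[r]);
        }
    }
}
```

Running it with the HotSpot flag `-XX:+PrintCompilation` prints JIT compilation events alongside the timings. Lines marked "made not entrant" indicate compiled code being invalidated, which can help pin down when a deoptimisation happens.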

Note: this is a follow-up to "Is there any advantage of calling map after mapToInt, where ever required?"

assylias
  • I believe it's the same issue as [here](http://stackoverflow.com/a/25851390/4856258). – Tagir Valeev Sep 09 '15 at 10:33
  • You may well have a deoptimisation or recompilation. That's why warmup phases are important: they let that happen before the final measurements ([see rule 5](https://wiki.openjdk.java.net/display/HotSpot/MicroBenchmarks)). As for "*is this de-optimisation representative of what would happen in a real application*": it is a micro-benchmark, so it quite possibly isn't at all representative of what will happen in a real-world application once you have filled up your CPU caches with other stuff, etc. (although it may be useful in other ways). – Andy Brown Sep 09 '15 at 10:36
  • @TagirValeev It looks like it indeed. – assylias Sep 09 '15 at 10:37
  • @assylias, don't worry, this first-iteration performance is still unrelated to the performance in a real application. In a real application your type profile is polluted, so full inlining would be impossible anyway. It's not so trivial to write a proper JMH benchmark involving the Stream API. – Tagir Valeev Sep 09 '15 at 10:42
  • @TagirValeev The optimisation comes back after about 30 runs. Whether that would happen in the real world or not is indeed hard to tell... – assylias Sep 09 '15 at 10:43
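
The "polluted type profile" idea from the comments above can be sketched in plain Java: run several unrelated pipelines through the same stream methods before timing the pipeline of interest, so the shared stream internals have already seen more than one lambda type. This is a minimal sketch under assumptions; the class `ProfilePollution`, its method names, and the repetition counts are hypothetical, and whether it reproduces the slowdown depends on the JVM version and flags.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;
import java.util.stream.Collectors;

// Hypothetical illustration of type-profile pollution; not taken from the question.
final class ProfilePollution {

    static int blackhole;

    // The pipeline under test, as in the question.
    static int target(List<Integer> list) {
        return list.stream().mapToInt(x -> x * x).sum();
    }

    // Runs unrelated pipelines so mapToInt/sum see several different lambda classes.
    static int pollute(List<Integer> list, int repetitions) {
        List<ToIntFunction<Integer>> mappers =
            Arrays.<ToIntFunction<Integer>>asList(x -> x + 1, x -> x * 2, x -> -x);
        int sink = 0;
        for (int r = 0; r < repetitions; r++) {
            for (ToIntFunction<Integer> f : mappers) {
                sink += list.stream().mapToInt(f).sum();
            }
            // A pipeline with a different shape (filter before mapToInt).
            sink += list.stream().filter(x -> (x & 1) == 0)
                        .mapToInt(Integer::intValue).sum();
        }
        return sink;
    }

    // Average time per call of target(), in microseconds.
    static double avgMicros(List<Integer> list, int calls) {
        int sink = 0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += target(list);
        }
        long elapsedNanos = System.nanoTime() - start;
        blackhole = sink;
        return elapsedNanos / 1_000.0 / calls;
    }

    public static void main(String[] args) {
        List<Integer> list =
            new Random().ints(100_000).boxed().collect(Collectors.toList());
        blackhole = pollute(list, 2_000); // repetition count is an arbitrary assumption
        for (int round = 1; round <= 10; round++) {
            System.out.printf("round %2d: %.3f us/op%n", round, avgMicros(list, 1_000));
        }
    }
}
```

Comparing these timings with a run where the `pollute` call is commented out gives a rough impression of how much the clean-profile numbers depend on the benchmark being the only user of the stream code.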

0 Answers