
I was trying to observe the effect of CPU cache spatial locality by benchmarking sequential and random reads of an array with JMH. Interestingly, the results are almost the same.

So I wonder, is this the correct JMH approach?

Below is the test class I used:

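import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OperationsPerInvocation;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@OutputTimeUnit(TimeUnit.NANOSECONDS)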
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(MyBenchmark.N)
public class MyBenchmark {

    /*
     * # JMH version: 1.21
     * # VM version: JDK 1.8.0_91, Java HotSpot(TM) 64-Bit Server VM, 25.91-b15
     * # VM invoker: D:\jdk1.8.0_91\jre\bin\java.exe
     * # VM options: <none>
     * # Warmup: 5 iterations, 10 s each
     * # Measurement: 5 iterations, 10 s each
     * # Timeout: 10 min per iteration
     * # Threads: 1 thread, will synchronize iterations
     * # Benchmark mode: Average time, time/op
     * 
     * Benchmark                 Mode  Cnt  Score   Error  Units
     * MyBenchmark.randomAccess  avgt   25  7,930 ± 0,378  ns/op
     * MyBenchmark.serialAccess  avgt   25  7,721 ± 0,081  ns/op
     */
    static final int N = 1_000;

    @State(Scope.Benchmark)
    public static class Common {

        int[] data = new int[N];
        int[] serialAccessOrder = new int[N];
        int[] randomAccessOrder = new int[N];

        public Common() {
            Random r = new Random(11234);
            for (int i=0; i<N; i++) {
                data[i] = r.nextInt(N);
                serialAccessOrder[i] = i;
                randomAccessOrder[i] = data[i];
            }
        }
    }

    @Benchmark
    public void serialAccess(Blackhole bh, Common common) {
        for (int i=0; i<N; i++) {
            bh.consume(common.data[common.serialAccessOrder[i]]);
        }
    }

    @Benchmark
    public void randomAccess(Blackhole bh, Common common) {
        for (int i=0; i<N; i++) {
            bh.consume(common.data[common.randomAccessOrder[i]]);
        }
    }
}
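
One detail of the setup above: randomAccessOrder is filled from r.nextInt(N), so some indices repeat and others are never read. If each element should be visited exactly once in random order, a shuffled permutation is one option. The following is only a minimal sketch; the class name AccessOrders and the method shuffledOrder are hypothetical and not part of the benchmark above.

```java
import java.util.Random;

// Hypothetical helper, not part of the original benchmark.
final class AccessOrders {

    /** Returns the indices 0..n-1 in pseudo-random order, each exactly once (Fisher-Yates shuffle). */
    static int[] shuffledOrder(int n, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;                 // start from the sequential order
        }
        Random r = new Random(seed);      // fixed seed keeps runs reproducible
        for (int i = n - 1; i > 0; i--) {
            int j = r.nextInt(i + 1);     // pick a position in [0, i]
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }
}
```

In the Common constructor this could replace the per-element assignment, for example randomAccessOrder = AccessOrders.shuffledOrder(N, 11234L), assuming the field is left non-final as it is above.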

Update: It turns out N was too small (1_000 * 4 bytes/int ~= 4KB), so most likely the entire array was cached. Increasing N to 1_000_000 yields more intuitive results:

Benchmark                 Mode  Cnt   Score   Error  Units
MyBenchmark.randomAccess  avgt   25  20,426 ± 0,678  ns/op
MyBenchmark.serialAccess  avgt   25   6,762 ± 0,252  ns/op
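
The size argument can be made concrete with a little arithmetic. The sketch below estimates the bytes each benchmark method reads (the data array plus one access-order array, ignoring object headers) and compares that against cache capacities. The capacities of 32 KiB, 256 KiB and 8 MiB are assumptions chosen for illustration, not values measured on the machine used above, and the class WorkingSet is hypothetical.

```java
// Rough working-set estimate; cache sizes are illustrative assumptions.
final class WorkingSet {

    // Assumed cache capacities in bytes (hypothetical; check the actual CPU).
    static final long L1_BYTES = 32L * 1024;
    static final long L2_BYTES = 256L * 1024;
    static final long L3_BYTES = 8L * 1024 * 1024;

    /** Bytes read per benchmark method: data[] plus one order array, array headers ignored. */
    static long footprintBytes(int n) {
        return 2L * n * Integer.BYTES;
    }

    /** Smallest assumed cache level that could hold the whole footprint. */
    static String smallestFit(long bytes) {
        if (bytes <= L1_BYTES) return "L1";
        if (bytes <= L2_BYTES) return "L2";
        if (bytes <= L3_BYTES) return "L3";
        return "RAM";
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 1_000_000}) {
            long bytes = footprintBytes(n);
            System.out.println("N=" + n + ": " + bytes + " bytes, smallest fit: " + smallestFit(bytes));
        }
    }
}
```

Under these assumed sizes, N = 1_000 gives about 8 KB, which fits in L1 regardless of access order, while N = 1_000_000 gives about 8 MB, which no longer fits in L1 or L2, so random reads pay for misses that sequential reads mostly avoid.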
Holger
pistolPanties
  • Not an expert here, but I think that when you call bh.consume(something), the something is already loaded, so by the time consume runs the time difference has already disappeared. – LeDYoM Mar 18 '19 at 22:11
  • @LeDYoM I just figured out why and updated the question. – pistolPanties Mar 18 '19 at 22:40
  • To be honest, you can post your finding as your own answer, since to me this is correct. – Eugene Mar 19 '19 at 18:47

0 Answers