
I wrote a CAS loop (a while loop around compare-and-set) by hand instead of invoking the Unsafe.getAndAddInt method directly. When I benchmark both with JMH, the manual version shows a big performance loss, even though my code is just a copy of the source code of the Unsafe method. What causes this big difference? Thanks in advance.

The JMH result is:

Benchmark              Mode  Cnt  Score   Error  Units
CASTest.casTest        avgt       0.047          us/op
CASTest.manualCasTest  avgt       0.137          us/op  

The source code is:

package org.sample;

import java.lang.reflect.Field;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import sun.misc.Unsafe;

/**
 * @author Isaac Gao
 * @Date 2020/2/20
 */
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
@Threads(2)
@Measurement(iterations = 1, time = 1, timeUnit = TimeUnit.SECONDS)
@Warmup(iterations = 2, time = 1)
@Fork(1)
public class CASTest {

  private static Unsafe getUnsafe() {
    try {
      final Field unsafeField = Unsafe.class.getDeclaredField("theUnsafe");

      unsafeField.setAccessible(true);
      return (Unsafe) unsafeField.get(null);
    } catch (NoSuchFieldException | IllegalAccessException e) {
      e.printStackTrace();
    }
    return null;
  }
  private static final Unsafe unsafe = getUnsafe();

  private static final long valueOffset;
  static {
    try {
      valueOffset = unsafe.objectFieldOffset
          (CASTest.class.getDeclaredField("value"));
    } catch (Exception ex) { throw new Error(ex); }
  }

  private volatile int value;

  @Benchmark
  public void manualCasTest(Blackhole bh) {
    int andAddIntManually = getAndAddIntManually(this, valueOffset, 1);
    bh.consume(andAddIntManually);
  }

  @Benchmark
  public void casTest(Blackhole bh) {
    int andAddInt = unsafe.getAndAddInt(this, valueOffset, 1);
    bh.consume(andAddInt);
  }

  public final int getAndAddIntManually(Object o, long offset, int delta) {
    int v;
    do {
      v = unsafe.getIntVolatile(o, offset);
    } while (!unsafe.compareAndSwapInt(o, offset, v, v + delta));
    return v;
  }

  public static void main(String[] args) throws RunnerException {
    Options opt = new OptionsBuilder()
        .include(CASTest.class.getSimpleName())
        .build();
    new Runner(opt).run();
  }
}

Holger
Isaac Gao

1 Answer


The executed code doesn’t necessarily match what you’ve seen in the source code. A similar performance mismatch for copied and pasted code has been discussed in “Does Java JIT cheat when running JDK code?”

Well-known methods may get replaced by special implementations, regardless of whether their original declaration was native or had a pure Java implementation. See also “What does 'intrinsify' mean in the JVM source code?”

When we look in the JVM source file vmSymbols.hpp, line 1031, we will see that sun.misc.Unsafe.getAndAddInt is known to the JVM.

You can use
-XX:CompileCommand=print,CASTest.casTest -XX:CompileCommand=print,CASTest.manualCasTest
to check the resulting native code (which is generally a good idea for evaluating benchmark results).

On x64, you’ll see that manualCasTest is compiled just as you’ve written it, as a loop centered around a
lock cmpxchg dword ptr [rsi],ebx instruction, whereas casTest contains a single, loop-free
lock xadd dword ptr [rdx+0ch],r8d instruction (details may vary).
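
To make the point concrete that the two variants differ only in how they are compiled, not in what they compute, here is a minimal sketch using only java.util.concurrent.atomic.AtomicInteger instead of Unsafe. The class name GetAndAddEquivalence and its helper methods are hypothetical and chosen for illustration. It is not a benchmark; it only checks that the hand-written loop and the library call return the same results.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class, not part of the original benchmark.
public class GetAndAddEquivalence {

  // Hand-written CAS loop: read the value, then retry compareAndSet until it succeeds.
  static int getAndAddManually(AtomicInteger counter, int delta) {
    int v;
    do {
      v = counter.get();
    } while (!counter.compareAndSet(v, v + delta));
    return v;
  }

  // Library call; a JIT compiler is free to replace it with a special implementation.
  static int getAndAddBuiltIn(AtomicInteger counter, int delta) {
    return counter.getAndAdd(delta);
  }

  // Starts `threads` workers that each add 1 `perThread` times; returns the final value.
  static int runConcurrently(boolean manual, int threads, int perThread) {
    AtomicInteger counter = new AtomicInteger();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          if (manual) {
            getAndAddManually(counter, 1);
          } else {
            getAndAddBuiltIn(counter, 1);
          }
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return counter.get();
  }

  public static void main(String[] args) {
    // Both variants lose no updates, so both lines print 200000.
    System.out.println("manual:   " + runConcurrently(true, 2, 100_000));
    System.out.println("built-in: " + runConcurrently(false, 2, 100_000));
  }
}
```

Both variants produce the same counts, so any timing difference between them comes from the generated machine code, which is why inspecting the compiled output is the way to explain the benchmark numbers.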

Holger
    See also [How to see JIT-compiled code in JVM?](https://stackoverflow.com/a/15146962/2711488) – Holger Feb 20 '20 at 18:59