Refer to the existing discussion at AtomicInteger lazySet vs. set for the background of AtomicInteger.lazySet().
So according to the semantics of AtomicInteger.lazySet(), on an x86 CPU, AtomicInteger.lazySet() should be equivalent to a plain write to the value field of the AtomicInteger, because the x86 memory model (TSO) already guarantees the ordering among write operations.
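To make the intended semantics concrete, here is a minimal sketch (not from the JDK itself) of the typical single-writer publication pattern: lazySet() promises that earlier stores are not reordered after it (store-store ordering), without paying for the full StoreLoad fence that a volatile set() implies. The class and field names are illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazySetDemo {
    static int payload;                               // plain field written before publication
    static final AtomicInteger flag = new AtomicInteger(0);

    public static void main(String[] args) {
        payload = 42;     // ordinary store
        flag.lazySet(1);  // ordered store: payload = 42 cannot be reordered after it
        System.out.println(flag.get() + " " + payload);
    }
}
```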
However, the runtime behavior of AtomicInteger.lazySet() differs between the interpreter and the JIT compiler (specifically the C2 compiler) in the JDK 8 HotSpot JVM, which confuses me.
First, create a simple Java application as a demo:
import java.util.concurrent.atomic.AtomicInteger;

public class App {
    public static void main(String[] args) throws Exception {
        AtomicInteger i = new AtomicInteger(0);
        i.lazySet(1);
        System.out.println(i.get());
    }
}
Then, dump the instructions for AtomicInteger.lazySet() generated by the intrinsic implementation in the C2 compiler:
$ java -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:-TieredCompilation -XX:CompileCommand=print,*AtomicInteger.lazySet App
...
0x00007f1bd927214c: mov %edx,0xc(%rsi) ;*invokevirtual putOrderedInt
; - java.util.concurrent.atomic.AtomicInteger::lazySet@8 (line 110)
As you can see, the operation is, as expected, a plain mov, i.e. a normal write.
Then, use GDB to trace the runtime behavior of the interpreter for AtomicInteger.lazySet().
$ gdb --args java App
(gdb) b Unsafe_SetOrderedInt
0x7ffff69ae836 callq 0x7ffff69b6642 <OrderAccess::release_store_fence(int volatile*, int)>
0x7ffff69b6642:
push %rbp
mov %rsp,%rbp
mov %rdi,-0x8(%rbp)
mov %esi,-0xc(%rbp)
mov -0xc(%rbp),%eax
mov -0x8(%rbp),%rdx
xchg %eax,(%rdx) // the write operation
mov %eax,-0xc(%rbp)
nop
pop %rbp
retq
As you can see, the operation is actually an XCHG instruction, which has implicit lock semantics and therefore carries exactly the performance overhead that AtomicInteger.lazySet() is intended to eliminate.
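Functionally, the two paths are indistinguishable from Java code; the difference is only in the fence emitted. The sketch below (my own illustration, not from the question's demo) contrasts the two calls: under C2 on x86, set() as a volatile write ends in a locked/XCHG instruction, while lazySet() should compile to a plain mov; yet the interpreter path traced above routes lazySet() through OrderAccess::release_store_fence, which uses XCHG for both.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FenceCost {
    public static void main(String[] args) {
        AtomicInteger a = new AtomicInteger(0);
        a.set(1);      // volatile write: full StoreLoad fence (locked instruction on x86)
        a.lazySet(2);  // ordered write: store-store ordering only (plain mov under C2)
        System.out.println(a.get());
    }
}
```

Either way the final value is the same; only the cost of the store differs, which is why lazySet() exists at all.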
Does anyone know why there is such a difference? Thanks.