GCC implements __sync_val_compare_and_swap on PowerPC[64] as:
sync
1: lwarx 9,0,3
cmpw 0,9,4
bne 0,2f
stwcx. 5,0,3
bne 0,1b
2: isync
The GCC documentation says of the __sync_* builtins:
In most cases, these builtins are considered a full barrier. That is, no memory operand will be moved across the operation, either forward or backward. Further, instructions will be issued as necessary to prevent the processor from speculating loads across the operation and from queuing stores after the operation.
However, the use of isync rather than sync at the end is bothering me. Is this actually a full barrier? Or:
1. Could loads performed after the __sync_val_compare_and_swap fail to see stores performed before the store that produced the value that __sync_val_compare_and_swap loaded?
2. Could stores performed after the __sync_val_compare_and_swap be seen by other threads before they see the value stored by the __sync_val_compare_and_swap?
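
To make the first question concrete, here is a minimal litmus-test sketch in C. The names data, flag, writer, reader and run_litmus_once are made up for illustration and are not from GCC or from any real code base. The writer stores a payload and then publishes a flag with the builtin; the reader spins on the builtin until it loads the published value and then reads the payload. The question is whether that final read could still return the old value. A passing run on one machine does not prove anything about the barrier; it only shows the shape of the scenario.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state for the litmus test (not from the question). */
static int data;   /* payload, written before the flag is published   */
static int flag;   /* 0 = not ready, 1 = published, 2 = claimed       */

/* Writer: store the payload, then publish the flag with the builtin. */
static void *writer(void *arg)
{
    (void)arg;
    data = 42;                                       /* earlier store */
    (void)__sync_val_compare_and_swap(&flag, 0, 1);  /* publish       */
    return NULL;
}

/* Reader: spin until the CAS loads 1 (swapping in 2), then read data. */
static void *reader(void *arg)
{
    int *seen = arg;
    while (__sync_val_compare_and_swap(&flag, 1, 2) != 1)
        ;                       /* flag not published yet, retry */
    *seen = data;               /* question 1: can this still be 0? */
    return NULL;
}

/* Runs one round with fresh state and returns what the reader saw. */
int run_litmus_once(void)
{
    pthread_t w, r;
    int seen = -1;
    int rc;

    data = 0;
    flag = 0;

    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

If the builtin orders the reader's later load after its own load of flag, run_litmus_once() should return 42 on every call; the first question asks whether the trailing isync is enough to guarantee that. Depending on the toolchain, building this may require the -pthread option.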