I am studying 'Computer Organization and Design', RISC-V edition, by David A. Patterson. The Elaboration on page 254 contains the code below.
Here are the book's text and the related code:
While the code above implemented an atomic exchange, the following code would more efficiently acquire a lock at the location in register x20, where the value of 0 means the lock was free and 1 to mean lock was acquired:
       addi x12, x0, 1        // copy locked value
again: lr.d x10, (x20)        // load-reserved to read lock
       bne  x10, x0, again    // check if it is 0 yet
       sc.d x11, x12, (x20)   // attempt to store new value
       bne  x11, x0, again    // branch if store fails
This lock code is based on the original atomic exchange code that appears just before it in the book:
Since the load-reserved returns the initial value, and the store-conditional returns 0 only if it succeeds, the following sequence implements an atomic exchange on the memory location specified by the contents of x20:
again: lr.d x10, (x20)        // load-reserved
       sc.d x11, x23, (x20)   // store-conditional
       bne  x11, x0, again    // branch if store fails
       addi x23, x10, 0       // put loaded value in x23
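To make the comparison concrete, here is a rough C11 sketch of how I read the two sequences. It is only my interpretation, not code from the book. The function names (exchange, try_acquire, acquire, release) are ones I made up, the memory orderings are my own assumption because the book's listing does not specify any, and I am assuming that atomic_compare_exchange_weak is a fair stand-in for an lr.d/sc.d pair because it is allowed to fail spuriously.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Lock word convention from the book: 0 = free, 1 = held. */

/* Atomic exchange: always stores new_value and returns the old contents,
   retrying while the conditional store fails (like sc.d returning nonzero). */
static long exchange(atomic_long *addr, long new_value)
{
    long old = atomic_load_explicit(addr, memory_order_relaxed);
    /* On failure the weak compare-exchange refreshes 'old' with the current
       value, so every retry starts from a fresh read (like going back to lr.d). */
    while (!atomic_compare_exchange_weak_explicit(
               addr, &old, new_value,
               memory_order_acq_rel, memory_order_relaxed)) {
    }
    return old;
}

/* One pass through the lock sequence. Returns true if the lock was taken. */
static bool try_acquire(atomic_long *lock)
{
    long expected = 0;
    /* Like lr.d + bne: if the lock is held, do not attempt a store at all. */
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return false;
    /* Like sc.d: may fail even when the lock looked free, so callers retry. */
    return atomic_compare_exchange_weak_explicit(
        lock, &expected, 1, memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is taken (the "again" loop in the book's listing). */
static void acquire(atomic_long *lock)
{
    while (!try_acquire(lock)) {
    }
}

/* Release is not shown in the book's listing; storing 0 is my assumption. */
static void release(atomic_long *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

If this reading is right, the exchange loop attempts a store on every pass, while the lock loop only reads until it sees 0 and attempts a store afterwards. That may be what 'more efficiently' refers to, but I am not sure.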
1- The book says that the lock version, which adds addi x12, x0, 1 // copy locked value, is 'more efficient' than the atomic exchange. I don't see where the efficiency comes from.
2- I think this lock code still cannot avoid sc.d failing spuriously, because of how the hardware is designed around cache lines. Am I right?