What is the difference between the LDAXR and LDXR instructions in the AArch64 instruction set?

In the reference manual they look identical (except for the word 'acquire'):

LDAXR - Load-Acquire Exclusive Register: loads word from memory addressed by base to Wt. Records the physical address as an exclusive access.

LDXR - Load Exclusive Register: loads a word from memory addressed by base to Wt. Records the physical address as an exclusive access.

Thanks

user3124812

1 Answer

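In the simplest form, LDAXR behaves like LDXR followed by DMB SY. (LDAEX, which appears in the quotes below, is the AArch32 mnemonic for the same operation.)

This is the description I found for LDAXR: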

C6.2.104 LDAXR

Load-Acquire Exclusive Register derives an address from a base register value, loads a 32-bit word or 64-bit doubleword from memory, and writes it to a register. The memory access is atomic. The PE marks the physical address being accessed as an exclusive access. This exclusive access mark is checked by Store Exclusive instructions. See Synchronization and semaphores on page B2-135. The instruction also has memory ordering semantics as described in Load-Acquire, Load-AcquirePC, and Store-Release on page B2-108. For information about memory accesses see Load/Store addressing modes on page C1-157.

From section K11.3 of DDI 0487D.a:

The ARMv8 architecture adds the acquire and release semantics to Load-Exclusive and Store-Exclusive instructions, which allows them to gain ordering acquire and/or release semantics. The Load-Exclusive instruction can be specified to have acquire semantics, and the Store-Exclusive instruction can be specified to have release semantics. These can be arbitrarily combined to allow the atomic update created by a successful Load-Exclusive and Store-Exclusive pair to have any of:

  • No Ordering semantics (using LDREX and STREX).

  • Acquire only semantics (using LDAEX and STREX).

  • Release only semantics (using LDREX and STLEX).

  • Sequentially consistent semantics (using LDAEX and STLEX).
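The quoted list uses the AArch32 mnemonics; the AArch64 equivalents are LDXR, STXR, LDAXR and STLXR. As a rough sketch of how the four combinations surface in C11, the hypothetical helpers below (the function names are made up for illustration) each perform an atomic add with a different memory order. The instruction pairs named in the comments are what compilers typically emit for ARMv8.0 targets without the LSE atomics; that is an assumption about code generation, not something the C standard guarantees, so check the disassembly for your toolchain.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helpers: each atomically adds v to *p and returns the old value.
   The instruction pairs in the comments are typical ARMv8.0 (no LSE) output,
   stated here as an assumption. */

/* No ordering semantics: typically an LDXR / STXR retry loop. */
uint32_t add_relaxed(_Atomic uint32_t *p, uint32_t v) {
    return atomic_fetch_add_explicit(p, v, memory_order_relaxed);
}

/* Acquire only: typically LDAXR / STXR. */
uint32_t add_acquire(_Atomic uint32_t *p, uint32_t v) {
    return atomic_fetch_add_explicit(p, v, memory_order_acquire);
}

/* Release only: typically LDXR / STLXR. */
uint32_t add_release(_Atomic uint32_t *p, uint32_t v) {
    return atomic_fetch_add_explicit(p, v, memory_order_release);
}

/* Sequentially consistent: typically LDAXR / STLXR. */
uint32_t add_seq_cst(_Atomic uint32_t *p, uint32_t v) {
    return atomic_fetch_add_explicit(p, v, memory_order_seq_cst);
}
```

Compiling this for an AArch64 target and inspecting the generated assembly is a quick way to see which exclusive load and store variants your compiler picks for each memory order.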

Also, from section B2.3.5:

The basic principle of a Load-Acquire instruction is to introduce order between the memory access generated by the Load-Acquire instruction and the memory accesses appearing in program order after the Load-Acquire instruction, such that the memory access generated by the Load-Acquire instruction is Observed-by each PE, to the extent that that PE is required to observe the access coherently, before any of the memory accesses appearing in program order after the Load-Acquire instruction are Observed-by that PE, to the extent that the PE is required to observe the accesses coherently.
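A common place where this ordering matters is taking a lock. The following is a minimal sketch of a spinlock in C11, with a hypothetical type and function names invented for illustration. The acquire ordering on the successful compare-exchange is what keeps the accesses inside the critical section from being observed before the lock itself was taken. On ARMv8.0 without LSE, compilers typically implement the compare-exchange with a load-acquire exclusive and a store exclusive, and the unlock with a store-release; that mapping is an assumption about typical code generation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock, for illustration only. */
typedef struct {
    atomic_bool locked;
} demo_spinlock;

/* Try once to take the lock; returns true on success.
   Acquire ordering on success: later accesses in program order
   cannot be observed before the lock was taken. */
bool demo_trylock(demo_spinlock *l) {
    bool expected = false;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, true,
        memory_order_acquire,   /* ordering if the exchange succeeds */
        memory_order_relaxed);  /* ordering if it fails */
}

/* Spin until the lock is taken. */
void demo_lock(demo_spinlock *l) {
    while (!demo_trylock(l)) {
        /* busy-wait */
    }
}

/* Release ordering: accesses made while holding the lock are
   observed before the lock is seen as free. */
void demo_unlock(demo_spinlock *l) {
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

With a plain (relaxed) exchange instead of the acquire one, the lock word would still be updated atomically, but loads in the critical section could be satisfied before the lock was actually held.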

Sean Houlihane
  • Thanks, I searched by an instruction mnemonic and missed that part. – user3124812 Jan 03 '19 at 23:44
  • So just to summarise, that's only about memory ordering and, simply speaking, `LDAXR` == `LDXR + DMB_SY` – user3124812 Jan 03 '19 at 23:46
  • Note that the internal implementation does *not* have to include a barrier for *everything* like `DMB_SY`. A high-performance implementation may only make sure it happens after any preceding release-stores, but allow reordering with previous plain stores. That's all that's needed to make the operation itself `seq_cst`. (This makes it possible for seq_cst on AArch64 to be significantly less expensive than on other machines, if you avoid doing a seq_cst load right after a seq_cst store or RMW.) – Peter Cordes Apr 05 '21 at 18:02