28

If mem is a shared memory location, do I need:

XCHG EAX,mem

or:

LOCK XCHG EAX,mem

to do the exchange atomically?

Googling this yields both yes and no answers. Does anyone know this definitively?

Bernard
Walter Bright
  • Related: [Does lock xchg have the same behavior as mfence?](https://stackoverflow.com/q/40409297) (regular `xchg` is the same as `lock xchg` on 386 and newer.) – Peter Cordes Sep 03 '22 at 18:55

3 Answers

35

Intel's documentation seems pretty clear that it is redundant.

IA-32 Intel® Architecture Software Developer’s Manual Volume 3A: System Programming Guide, Part 1

7.1.2.1 says:

The operations on which the processor automatically follows the LOCK semantics are as follows:

  • When executing an XCHG instruction that references memory.

Similarly,

Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B: Instruction Set Reference, N-Z

XCHG:

If a memory operand is referenced, the processor’s locking protocol is automatically implemented for the duration of the exchange operation, regardless of the presence or absence of the LOCK prefix or of the value of the IOPL.

Note that this doesn't actually mean that the LOCK# signal is asserted whether or not the LOCK prefix is used. Section 7.1.4 describes how, on later processors, locking semantics are preserved without asserting LOCK# if the memory location is cached. Clever, and definitely over my head.
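For anyone who reaches this from C rather than assembly, here is a minimal sketch of the same idea using C11 `atomic_exchange`. The names `xchg_lock`, `xchg_lock_try`, `xchg_lock_acquire`, `xchg_lock_release` and `xchg_demo` are made up for illustration and come from neither the question nor Intel's manuals. On x86, GCC and Clang typically compile the exchange to a plain `xchg` with a memory operand and no explicit `lock` prefix, which fits the documentation quoted above; check your own compiler's output rather than taking that on faith.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} xchg_lock;

/* Try once: atomically store 1 and look at the value that was there before.
 * A previous value of 0 means we are the ones who took the lock. */
bool xchg_lock_try(xchg_lock *l)
{
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

/* Spin until the exchange returns 0, i.e. until we take the lock. */
void xchg_lock_acquire(xchg_lock *l)
{
    while (!xchg_lock_try(l)) {
        /* busy-wait; a real lock would back off or yield here */
    }
}

/* Releasing only needs a store with release ordering. */
void xchg_lock_release(xchg_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Single-threaded sanity check of the exchange semantics. */
int xchg_demo(void)
{
    xchg_lock l = { 0 };
    assert(xchg_lock_try(&l));   /* lock was free, so the exchange saw 0 */
    assert(!xchg_lock_try(&l));  /* lock is now held, so the exchange saw 1 */
    xchg_lock_release(&l);
    return atomic_load(&l.locked);  /* 0 again after release */
}
```

The exchange is the whole trick: the store of 1 and the read of the old value happen as one indivisible step, so two threads can never both observe 0.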

CB Bailey
  • The PrintAssembly Option on the Oracle Hotspot JVM also seems to agree with this. When generating assembly, it does _not_ have lock prefix on xchg instruction on x86-64. – Kedar Mhaswade Nov 13 '13 at 22:29
  • The section is `8.1.2.1 Automatic Locking` in the current version of the manual, not 7.1.2.1. – Ruslan Mar 05 '19 at 16:10
  • The "cache lock" idea is that while this core has exclusive ownership (MESI) of a cache line, everything it does to that line might as well be a single atomic transaction from the PoV of everything else in the system that respects cache coherency, i.e. everything. So the core just has to delay responding to MESI share and invalidate requests for that line between the load and the store operations of an atomic RMW instruction. (And because of x86's strong memory-ordering rules, only do this after making earlier loads+stores visible, etc. so it's a full barrier.) – Peter Cordes Sep 03 '22 at 18:36
13

Since the 386 days, xchg with a memory operand asserts the Lock signal whether or not you put the lock prefix on it. Intel's documentation covers this quite clearly in the IA-32 instruction set reference, N-Z.

Yann Vernier
5

As per the 80386 Instruction Manual, BUS LOCK is asserted for the duration of the exchange. The LOCK prefix has no bearing on this operation, and neither does the value of the I/O Privilege Level.

Since the documentation states that BUS LOCK is asserted regardless of the presence of the LOCK prefix, the prefix is redundant but harmless, so LOCK XCHG EAX, mem is safe as well. When in doubt, add a LOCK.

Scott S. McCoy
  • `lock xchg` is only needed if your 16-bit code might run on a 286 or earlier, but yes it seems this implicit `lock` was new in 386. At least newly documented, IDK about real behaviour of earlier CPUs. – Peter Cordes Sep 03 '22 at 18:38