I can't find the implementation of AtomicCmpExchange (seems to be hidden), so I don't know what it does.

Is AtomicCmpExchange reliable on all platforms? How is it implemented internally? Does it use something like a critical section?

I have this scenario:

MainThread:

Target := 1;

Thread1:

x := AtomicCmpExchange(Target, 0, 0);

Thread2:

Target := 2;

Thread3:

Target := 3;

Will x always be an integer 1, 2 or 3, or could it be something else? In other words, even if AtomicCmpExchange(Target, 0, 0) fails to exchange the value, does it still return a "valid" integer, and not a half-read one (for example, because another thread was partway through writing the value)?

I want to avoid using a critical section because I need maximum speed.

Remy Lebeau
zeus

1 Answer

AtomicCmpExchange is what is known as an intrinsic routine, or a standard function. It is intrinsically known to the compiler and may or may not have a visible implementation. For example, Writeln is a standard function, but you won't find a single implementation for it. The compiler breaks it up into multiple calls to lower-level functions in System.pas. Some standard functions, such as Inc() and Dec(), don't have any implementation in System.pas at all. For those, the compiler generates machine code directly, which amounts to simple INC or DEC instructions.

Like Inc() or Dec(), AtomicCmpExchange() is implemented using whatever code is needed for a given platform. It will generate inline instructions. For x86/x64 it will generate a CMPXCHG instruction (along with whatever setup is necessary to get variables/values into the registers). For ARM it will generate a few more instructions around the LDREX and STREX instructions.

So the direct answer to your question is that, even by calling into hand-written assembly code, you cannot get much more efficient than using that standard function along with others such as AtomicIncrement, AtomicDecrement, and AtomicExchange.
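As a rough illustration of the call from the question, here is a minimal single-threaded sketch. It assumes a Delphi compiler recent enough to provide the Atomic* intrinsics. The program name and the variables `Seen` and `Old` are made up for the example, and `Target` is a naturally aligned global Integer.

```delphi
program AtomicReadSketch;

{$APPTYPE CONSOLE}

var
  // Naturally aligned 32-bit global; the name mirrors the question.
  Target: Integer = 1;
  // Hypothetical helper variables, not part of the original question.
  Seen, Old: Integer;

begin
  // NewValue and Comparand are both 0, so Target never changes value:
  // if Target is 0 it is replaced by 0, otherwise nothing is stored.
  // Either way the result is the value Target held at that instant.
  Seen := AtomicCmpExchange(Target, 0, 0);
  Writeln('atomic read: ', Seen);

  // An actual compare-and-swap: store 2 only if Target still equals Seen.
  // The return value is Target's previous value, so Old = Seen means success.
  Old := AtomicCmpExchange(Target, 2, Seen);
  if Old = Seen then
    Writeln('swap succeeded, Target = ', Target)
  else
    Writeln('swap lost a race, Target was ', Old);
end.
```

In a multithreaded program the second call could lose the race to another writer, which is why the result is compared against `Seen` before the swap is treated as successful.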

David Heffernan
Allen Bauer
  • Yes, it will always return a valid integer. If the value of the target matches the comparand, the value is swapped and the return value is the target's previous value. If the target doesn't match, nothing is stored and the target's current value is returned. All of this is done atomically, which means it is guaranteed to complete without any worry of intermediate states messing it up. – Allen Bauer Oct 16 '16 at 20:16
  • It's also worth noting that the variables must be aligned. – David Heffernan Oct 16 '16 at 20:49
  • Microsoft says that alignment is required (for both x86 and x64) but Intel seems to disagree: http://stackoverflow.com/questions/1415256/alignment-requirements-for-atomic-x86-instructions – Alexandre M Oct 16 '16 at 21:58
  • @Alexandre And what about the mobile targets which run on ARM? Even on Intel there are performance reasons that mean aligned access is preferable. – David Heffernan Oct 17 '16 at 06:06
  • @David Heffernan: So the correct answer is: for x86 and x64, alignment is preferred but not mandatory. – Alexandre M Oct 17 '16 at 09:13
  • @Alexandre The question explicitly states interest in "all platforms". Different platforms have different requirements. It's true that on x86 and x64 `LOCK` allows you to straddle cache lines, at the cost of performance. But the requirements on ARM are different. – David Heffernan Oct 17 '16 at 09:19