
I just learned about the Interlocked class and that it is supposed to be faster than simply locking. That's all well and good, but I'm curious about the implementation.

As far as I know, the only way to ensure that an operation on a variable is done atomically is to ensure that only one thread can access that variable at any moment in time, which is locking.

I've used Reflector to get the source of Interlocked, but it seems that it uses an external method to do all its work:

[MethodImpl(MethodImplOptions.InternalCall), ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
internal static extern int ExchangeAdd(ref int location1, int value);

I've run some tests, and Interlocked is in fact twice as fast as simply locking the object and incrementing it.

How are they doing it?

spender
Arsen Zahray

1 Answer


Interlocked is supported at the CPU level: the processor can perform the atomic operation directly, without taking a lock in software.

For example, on x86/x64, Interlocked.Increment is effectively an XADD, and compare-and-swap (i.e. Interlocked.CompareExchange) is supported via the CMPXCHG instruction (both with a LOCK prefix).
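As a rough illustration of those two primitives from the C# side, here is a minimal sketch. The class and method names (`InterlockedSketch`, `CountWithIncrement`, `AtomicMax`) are made up for this example; only `Interlocked.Increment`, `Interlocked.CompareExchange`, `Volatile.Read`, and `Parallel.For` are real framework calls. `AtomicMax` shows the usual compare-and-swap retry loop that is commonly used to build operations Interlocked does not offer directly.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedSketch
{
    // Several workers bump one shared counter; no lock statement is involved.
    public static int CountWithIncrement(int workers, int perWorker)
    {
        int counter = 0;
        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < perWorker; i++)
            {
                Interlocked.Increment(ref counter); // one atomic read-modify-write
            }
        });
        return counter;
    }

    // Hypothetical helper: raises 'location' to 'candidate' if candidate is larger.
    // Returns the value stored in 'location' once this call has finished with it.
    public static int AtomicMax(ref int location, int candidate)
    {
        int seen;
        do
        {
            seen = Volatile.Read(ref location);
            if (candidate <= seen)
            {
                return seen; // nothing to update
            }
            // The swap only succeeds if 'location' still holds 'seen';
            // otherwise another thread got in first and we retry.
        }
        while (Interlocked.CompareExchange(ref location, candidate, seen) != seen);
        return candidate;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithIncrement(4, 100000)); // 400000

        int max = 0;
        Parallel.For(0, 1000, i => { AtomicMax(ref max, i); });
        Console.WriteLine(max); // 999
    }
}
```

No thread ever waits on a lock here: the increment is a single atomic operation, and the compare-and-swap loop simply retries if another thread changed the value in between.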

Reed Copsey
  • The funny thing is: I just ran a test of Interlocked vs ++, and Interlocked seems faster – Arsen Zahray Sep 05 '13 at 15:37
  • @ArsenZahray How are you benchmarking? If you're doing it in debug, or under the VS test host, your benchmark will be highly skewed. – Reed Copsey Sep 05 '13 at 15:37
  • I ran each operation 100000000 times. I also just ran my program from cmd, and the result didn't change (lock: 00:00:02.4291389, Interlocked: 00:00:01.1740671, ++: 00:00:01.4320819) – Arsen Zahray Sep 05 '13 at 15:39
  • @ArsenZahray Release build? x86 or x64? – Reed Copsey Sep 05 '13 at 15:41
  • Yeah! I built it for "Any CPU", which on my PC is x64 – Arsen Zahray Sep 05 '13 at 15:43
  • @ArsenZahray My experience has been that it's a bit slower than ++, but I've only benchmarked with actual work occurring - there may be some JIT optimizations in place you're hitting that change depending on the operation you use, etc. – Reed Copsey Sep 05 '13 at 15:45
  • But how does the CPU deal with larger operations? For example, what if you need to use `Interlocked` to move a value that is larger than the CPU word? Does it prevent the CPU from taking on other work in the meantime, i.e. does it block? – J. Doe Jun 15 '18 at 11:09
  • @J.Doe Interlocked doesn't support values that are too large for the CPU to move atomically. – Reed Copsey Jun 18 '18 at 20:06
  • O-o-oh, shiny! Does that mean it's limited by the processor word? Where can I read about that limitation, my Google-fu failed me on this one. – J. Doe Jun 19 '18 at 08:11
  • @J.Doe Check the available API here: https://msdn.microsoft.com/en-us/library/system.threading.interlocked_methods(v=vs.110).aspx Also read the notes, which mention it: "The Read method and the 64-bit overloads of the Increment, Decrement, and Add methods are truly atomic only on systems where a System.IntPtr is 64 bits long." – Reed Copsey Jun 19 '18 at 18:24
  • I've tried to read the source code [1], but I couldn't find the actual source code of `ExchangeAdd`. Do you know where I can find it? [1] https://github.com/microsoft/referencesource/blob/master/mscorlib/system/threading/interlocked.cs#L233 – Akbari Jun 09 '20 at 15:49
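
On the 64-bit note quoted in the comments above, here is a small sketch of a `long` counter that goes through Interlocked for both writes and reads, so the code does not rely on a plain 64-bit read being atomic in the current process. The class and member names (`LongCounterSketch`, `Add`, `Snapshot`) are made up for illustration; `Interlocked.Add`, `Interlocked.Read`, and `IntPtr.Size` are the real framework members.

```csharp
using System;
using System.Threading;

public static class LongCounterSketch
{
    private static long total;

    // Atomic 64-bit add; returns the new total.
    public static long Add(long amount)
    {
        return Interlocked.Add(ref total, amount);
    }

    // Reads the 64-bit value through Interlocked instead of a plain field read.
    public static long Snapshot()
    {
        return Interlocked.Read(ref total);
    }

    public static void Main()
    {
        // IntPtr.Size is 8 in a 64-bit process and 4 in a 32-bit one.
        Console.WriteLine("Pointer width: " + (IntPtr.Size * 8) + " bits");

        Add(5000000000L); // deliberately larger than int.MaxValue
        Console.WriteLine(Snapshot());
    }
}
```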