777

Let's say that a class has a public int counter field that is accessed by multiple threads. This int is only incremented or decremented.

To increment this field, which approach should be used, and why?

  • lock(this.locker) this.counter++;
  • Interlocked.Increment(ref this.counter);
  • Change the access modifier of counter to public volatile.

Now that I've discovered volatile, I've been removing many lock statements and the use of Interlocked. But is there a reason not to do this?

Sam
  • 7,252
  • 16
  • 46
  • 65
core
  • 32,451
  • 45
  • 138
  • 193
  • 1
    https://www.simple-talk.com/blogs/2012/01/24/inside-the-concurrent-collections-concurrentqueue/ shows the use of volatile in arrays. I don't completely understand it, but it's another reference to what this does. – eran otzap Sep 28 '13 at 16:18
  • 63
    This is like saying "I've discovered that the sprinkler system is never activated, so I'm going to remove it and replace it with smoke alarms". The reason not to do this is *because it is incredibly dangerous* and *gives you almost no benefit*. If you have time to spend changing the code then **find a way to make it less multithreaded**! Don't find a way to make the multithreaded code more dangerous and easily broken! – Eric Lippert Feb 18 '14 at 17:25
  • 1
    My house has both sprinklers *and* smoke alarms. When incrementing a counter on one thread and reading it on another it seems like you need both a lock (or an Interlocked) *and* the volatile keyword. Truth? – yoyo Nov 05 '14 at 00:17
  • 2
    @yoyo No, you don't need both. – David Schwartz Oct 14 '16 at 21:17
  • Read the [Threading in C#](http://www.albahari.com/threading/) reference. It covers the ins and outs of your question. Each of the three have different purposes and side effects. – spoulson Sep 30 '08 at 19:33
  • https://www.harshmaurya.in/volatile-vs-lock-vs-interlocked-in-c-net/ An attempt of mine to explain the difference between all three. – Samarsh Mar 14 '22 at 04:05

10 Answers

984

Worst (won't actually work)

Change the access modifier of counter to public volatile

As other people have mentioned, this on its own isn't actually safe at all. The point of volatile is that multiple threads running on multiple CPUs can and will cache data and re-order instructions.

If it is not volatile, and CPU A increments a value, then CPU B may not actually see that incremented value until some time later, which may cause problems.

If it is volatile, this just ensures the two CPUs see the same data at the same time. It doesn't stop them at all from interleaving their read and write operations, which is the problem you are trying to avoid.

Second Best:

lock(this.locker) this.counter++;

This is safe to do (provided you remember to lock everywhere else that you access this.counter). It prevents any other threads from executing any other code which is guarded by locker. Using locks also prevents the multi-CPU reordering problems described above, which is great.

The problem is, locking is slow, and if you re-use the locker in some other place which is not really related then you can end up blocking your other threads for no reason.
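A minimal sketch of the lock-based version, following the field names from the question: every access to the counter, reads included, goes through the same lock object (the thread and iteration counts here are arbitrary).

```csharp
using System;
using System.Threading;

class LockedCounter
{
    private readonly object locker = new object();
    private int counter;

    public void Increment()
    {
        lock (this.locker) { this.counter++; }
    }

    public int Read()
    {
        // Reads must take the same lock, or they may observe a stale value.
        lock (this.locker) { return this.counter; }
    }

    static void Main()
    {
        var c = new LockedCounter();
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < 100000; n++) c.Increment();
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        Console.WriteLine(c.Read()); // always prints 400000
    }
}
```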

Best

Interlocked.Increment(ref this.counter);

This is safe, as it effectively does the read, increment, and write in 'one hit' which can't be interrupted. Because of this, it won't affect any other code, and you don't need to remember to lock elsewhere either. It's also very fast (as MSDN says, on modern CPUs, this is often literally a single CPU instruction).

I'm not entirely sure however if it gets around other CPUs reordering things, or if you also need to combine volatile with the increment.

Interlocked notes:

  1. Interlocked methods are concurrently safe on any number of cores or CPUs.
  2. Interlocked methods apply a full fence around instructions they execute, so reordering does not happen.
  3. Interlocked methods neither need nor support access to a volatile field: volatile places a half fence around operations on the given field, while Interlocked uses a full fence.
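A runnable sketch of the Interlocked approach (task and iteration counts are arbitrary): the final count is exact, with no lock object at all.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static int counter;

    static void Main()
    {
        // 8 tasks each increment 100000 times; no increment can be lost
        // to a torn read-modify-write.
        var tasks = new Task[8];
        for (int i = 0; i < tasks.Length; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                for (int n = 0; n < 100000; n++)
                    Interlocked.Increment(ref counter);
            });
        }
        Task.WaitAll(tasks);

        // WaitAll establishes the necessary ordering, so a plain read is safe here.
        Console.WriteLine(counter); // always prints 800000
    }
}
```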

Footnote: What volatile is actually good for.

As volatile doesn't prevent these kinds of multithreading issues, what's it for? A good example: say you have two threads, one which always writes to a variable (say queueLength), and one which always reads from that same variable.

If queueLength is not volatile, thread A may write five times, but thread B may see those writes as being delayed (or even potentially in the wrong order).

A solution would be to lock, but you could also use volatile in this situation. This would ensure that thread B always sees the most up-to-date thing that thread A has written. Note however that this logic only works if you have writers who never read and readers who never write, and if the thing you're writing is an atomic value. As soon as you do a single read-modify-write, you need to go to Interlocked operations or use a lock.
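A sketch of the single-writer/single-reader pattern the footnote describes (the class and field names are illustrative):

```csharp
using System;
using System.Threading;

class QueueMonitor
{
    // Written by exactly one thread, read by exactly one other thread.
    // int writes are atomic on their own; volatile ensures the reader
    // isn't served a stale cached copy forever.
    private volatile int queueLength;

    static void Main()
    {
        var monitor = new QueueMonitor();

        var writer = new Thread(() =>
        {
            for (int i = 1; i <= 5; i++)
            {
                monitor.queueLength = i; // volatile write
                Thread.Sleep(10);
            }
        });

        var reader = new Thread(() =>
        {
            // Without volatile, the JIT could hoist this read out of the
            // loop and spin forever on a stale value.
            while (monitor.queueLength < 5) { }
            Console.WriteLine("reader saw 5");
        });

        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
    }
}
```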

Community
  • 1
  • 1
Orion Edwards
  • 121,657
  • 64
  • 239
  • 328
  • 30
    "I'm not entirely sure ... if you also need to combine volatile with the increment." They cannot be combined AFAIK, as we can't pass a volatile by ref. Great answer by the way. – Hosam Aly Jan 17 '09 at 13:07
  • 48
    Thanx much! Your footnote on "What volatile is actually good for" is what I was looking for and confirmed how I want to use volatile. – Jacques Bosch May 10 '10 at 06:22
  • 3
    +1: CLR via C# has a section on volatile which boils down to precisely this. The term for the situation that volatile solves is called cache coherency (http://en.wikipedia.org/wiki/Cache_coherence). – Steven Evers Jun 01 '10 at 22:29
  • 7
    In other words, if a var is declared as volatile, the compiler will assume that the var's value will not remain the same (i.e. volatile) each time your code comes across it. So in a loop such as: while (m_Var) { }, and m_Var is set to false in another thread, the compiler won't simply check what's already in a register that was previously loaded with m_Var's value but reads the value out from m_Var again. However, it doesn't mean that not declaring volatile will cause the loop to go on infinitely - specifying volatile only guarantees that it won't if m_Var is set to false in another thread. – Zach Saw Jun 23 '11 at 07:41
  • @SnOrfus: Wrong. Volatile has nothing to do with cache coherency. The coherency is maintained regardless of what you do and is hardwired in the CPUs. See my reply above. – Zach Saw Jul 07 '11 at 01:26
  • 38
    @Zach Saw: Under the memory model for C++, volatile is how you've described it (basically useful for device-mapped memory and not a lot else). Under the memory model for the *CLR* (this question is tagged C#), volatile will insert memory barriers around reads and writes to that storage location. Memory barriers (and special locked variations of some assembly instructions) are how you tell the *processor* not to reorder things, and they're fairly important... – Orion Edwards Jul 08 '11 at 03:39
  • 4
    @Zach Saw: While you're right that on the majority of multi-cpu systems these days, and cache coherency will keep each cpu's data up to date, it *won't* account for the fact that cpu's can (and do) re-order instructions. You need memory barriers to prevent this, and as I mentioned above, volatile in the CLR emits memory barriers around every read/write to a volatile storage location... – Orion Edwards Jul 08 '11 at 03:49
  • @Orion: Memory barriers don't tell the processor not to reorder things... It simply waits until the store is complete. – Zach Saw Jul 08 '11 at 04:44
  • @Orion: You're wrong. Read the links I posted. Given x86/x64's strong memory model, volatile simply causes the compiler to *not* reorder instructions in a certain way. The rest of it still relies on x86/x64's memory model. – Zach Saw Jul 08 '11 at 04:45
  • @Orion: Yes, CLR's memory model is different from C++ -- in no way was I describing C++'s memory model. I was describing x86/x64's memory model. CLR runs on top of it, and when strong memory model is already guaranteed, why would the CLR want to do anything else to slow things down??? – Zach Saw Jul 08 '11 at 04:47
  • @Orion: You have to understand that re-ordering instructions does *not* affect the outcome of the store (and hence read from another processor) if the store is done in-order. A memory barrier ***after*** a store is sufficient for all intents and purposes. It guarantees that signalling a waiting thread will always see a committed value. C++ does *not* guarantee this with the volatile keyword - you have to do the barrier yourself. The base memory model guaranteed by the CLR is a weak memory model -- there's no *need* for the CLR to do memory barriers - it's vendor dependent. – Zach Saw Jul 08 '11 at 04:56
    pragmatically volatile will at least improve his issue though. – Tom Fobear Dec 01 '11 at 20:33
  • 3
    @ZachSaw, sorry, but you are a little bit wrong. Check this article from the expert: http://blogs.msdn.com/b/ericlippert/archive/2011/06/16/atomicity-volatility-and-immutability-are-different-part-three.aspx - In C#, "volatile" means not only "make sure that the compiler and the jitter do not perform any code reordering or register caching optimizations on this variable". It also means "tell the processors to do whatever it is they need to do to ensure that I am reading the latest value, even if that means halting other processors and making them synchronize main memory with their caches". – zihotki Apr 29 '12 at 21:56
  • 3
    @zihotki: That's the simple version. Look at the comments - the author was asked specific questions about cache coherency and his answer was what I've been trying to explain, apparently to no avail to some... – Zach Saw May 03 '12 at 10:55
  • 2
    @ZachSaw sigh.. I've no doubt you know more than I do about coherency and re-ordering, but you seem miss the point that The question is a C# question, so we're talking about the *CLR* memory model, NOT the x86/64, or any other memory model – Orion Edwards May 08 '12 at 23:39
  • @Orion: And we're back to square one. You said CLR emits memory barriers around ops involving volatile vars. I said not necessarily (there are other ways of implementing it as seen in the x86 memory model). Is that clearer now? – Zach Saw May 10 '12 at 04:42
  • 1
    @ZachSaw surprisingly, yes :-) Cheers – Orion Edwards May 16 '12 at 21:01
  • 22
    @ZachSaw: A volatile field in C# prevents the C# compiler and jit compiler from making certain optimizations that would cache the value. It also makes certain guarantees about what order reads and writes may be observed to be in on multiple threads. As an implementation detail it may do so by introducing memory barriers on reads and writes. The precise semantics guaranteed are described in the specification; note that the specification does *not* guarantee that a *consistent* ordering of *all* volatile writes and reads will be observed by *all* threads. – Eric Lippert Dec 03 '13 at 17:22
  • 7
    @ZachSaw - just re-reading the comments from ages ago, and I realised this comment of mine was wrong - *"Under the memory model for the CLR, volatile will insert memory barriers around reads and writes to that storage location"* - The CLR will insert memory barriers for some volatile operations *on ARM*, due to it's weaker memory model - but you're right, it doesn't do this for x86 because it doesn't need to. My apologies – Orion Edwards Jan 08 '14 at 19:26
  • 1
    @OrionEdwards After I read thought all the comments, I am not sure whether your accepted answer is correct. Could you help and add supplementary to your answer? – Mickey Jan 29 '14 at 07:53
  • 1
    @OrionEdwards That is indeed the CLR model (MSFT's implementation of CLI). However, the ECMA CLI specs a weak memory model so its implementation is really up to the vendor. But for practical purposes, let's just leave that as the conclusion. From an academics point of view, the ECMA CLI specs is a wreck. – Zach Saw Feb 14 '14 at 02:52
  • @EricLippert Which specification is that? The CLR or CLI? – Zach Saw Feb 14 '14 at 02:53
  • @ZachSaw: The C# specification. – Eric Lippert Feb 14 '14 at 15:39
  • 1
    @EricLippert C# ECMA-334 specifies that volatile fields provide acquire-­release semantics. Remember, I don't argue that it is not needed for weaker memory models. To say that volatile means the JIT will introduce memory barrier instructions is blatantly wrong. As explained numerous times, x86's strong memory model produces the same effect without the need for explicitly issuing memory barriers. So in effect, volatile behaves very similarly to C++ (similar not exactly). – Zach Saw Feb 18 '14 at 05:38
  • 3
    @ZachSaw: Yes: the specification says that acquire and release semantics will be imposed on volatile reads and writes. How the runtime chooses to do so is up to it; if it can get away with *not* introducing a full or half fence *instruction* due to some other guarantee made by a particular processor then it is under no obligation to generate the unnecessary code. – Eric Lippert Feb 18 '14 at 17:20
  • 2
    @EricLippert Yes. That's exactly what I've been trying to explain. Glad to see everyone finally caught up! – Zach Saw Feb 18 '14 at 23:47
  • I personally like Igor Ostrovsky's explanation of the memory model a lot. Makes you understand what volatile really does. http://msdn.microsoft.com/en-us/magazine/jj883956.aspx He also has a great blog http://www.igoro.com. – FrankyHollywood Mar 24 '14 at 11:35
  • Wouldn't "lock(this.locker) ++this.counter;" avoid the read before the optiizer got to it? – John Taylor Feb 02 '15 at 19:08
  • @JohnTaylor I'm not sure I understand what you're asking? The CPU always has to read the value from memory in order to increment it and write it back regardless of the way it gets incremented. – Orion Edwards Feb 03 '15 at 21:51
  • With the ++this.counter the compiler would generate some like **inc[ebx+0x12]** with this.counter++ it would do this with **"mov eax, [ebx+0x12]; inc eax; mov [ebx+0x12], eax"** Of course the optimizer might do the same thing already. – John Taylor Feb 03 '15 at 22:16
  • 2
    @JohnTaylor right, yeah. I inspected the assembly produced by the CLR JIT and it was emitting a single instruction for a post-increment, so the optimizer's obviously gotten to it. At any rate, **inc[ebx+0x12]** still involves the CPU doing a memory read, an increment, and a memory write – Orion Edwards Feb 05 '15 at 00:24
  • Note that if are using `Interlocked.Increment` to generate unique values, it is vital to use the return value. If you merely wish to guarantee that no increments are "lost" due to a bad thread interleaving, this does not matter. – Brian Feb 14 '19 at 16:03
163

EDIT: As noted in comments, these days I'm happy to use Interlocked for the cases of a single variable where it's obviously okay. When it gets more complicated, I'll still revert to locking...

Using volatile won't help when you need to increment - because the read and the write are separate instructions. Another thread could change the value after you've read but before you write back.

Personally I almost always just lock - it's easier to get right in a way which is obviously right than either volatility or Interlocked.Increment. As far as I'm concerned, lock-free multi-threading is for real threading experts, of which I'm not one. If Joe Duffy and his team build nice libraries which will parallelise things without as much locking as something I'd build, that's fabulous, and I'll use it in a heartbeat - but when I'm doing the threading myself, I try to keep it simple.
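As an illustration of why locking stays the simpler choice once more than one thing has to change together (the class and field names here are made up): two fields that must move in step can't be covered by a single Interlocked call, but a lock makes the invariant obvious.

```csharp
using System;
using System.Threading;

class RunningTotal
{
    private readonly object locker = new object();
    private long sum;
    private long count;

    public void Add(int value)
    {
        // sum and count must change together; two separate Interlocked
        // calls would let another thread see one updated without the other.
        lock (this.locker)
        {
            this.sum += value;
            this.count++;
        }
    }

    public double Average()
    {
        lock (this.locker)
        {
            return this.count == 0 ? 0.0 : (double)this.sum / this.count;
        }
    }

    static void Main()
    {
        var totals = new RunningTotal();
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < 10000; n++) totals.Add(7);
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        Console.WriteLine(totals.Average()); // always prints 7
    }
}
```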

Yair Nevet
  • 12,725
  • 14
  • 66
  • 108
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 20
    +1 for ensuring me to forget about lock-free coding from now. – Xaqron Jan 03 '11 at 01:51
  • 7
    lock-free codes are definitely not truly lock-free as they lock at some stage - whether at (FSB) bus or interCPU level, there's still a penalty you'd have to pay. However locking at these lower levels are generally faster so long as you don't saturate the bandwidth of where the lock occurs. – Zach Saw Jul 07 '11 at 01:25
  • 5
    There is nothing wrong with Interlocked, it is exactly what your looking for and faster than a full lock() – Jaap Mar 22 '12 at 20:24
  • 5
    @Jaap: Yes, these days I *would* use interlocked for a genuine single counter. I just wouldn't want to start messing around trying to work out interactions between *multiple* lock-free updates to variables. – Jon Skeet Mar 22 '12 at 20:30
  • 1
    @ZachSaw: Being "lock free" doesn't mean that a piece of code will always run at the same speed regardless of contention. Rather, it means that there is a limit to how long a thread could be held waiting as a consequence of another thread getting waylaid indefinitely (in lock-based code, if a thread gets waylaid while holding a lock, threads which need the lock may be blocked indefinitely). Note that merely being lock-free doesn't guarantee that code won't have to wait indefinitely for a heavily-contested resource, but `Interlocked` functions provide the latter guarantee as well. – supercat Aug 14 '12 at 15:04
  • @supercat: Huh? Who or what are you replying to?? – Zach Saw Aug 17 '12 at 07:18
  • @ZachSaw: I was responding to your second comment; though it was written awhile ago, I just wanted to clarify for the benefit of future readers that interlocked operations are lock-free on modern CPUs which include hardware to prevent bus locks from being held indefinitely. – supercat Aug 17 '12 at 14:55
  • @supercat: Which comment did I say interlocked ops could cause bus locks to be held indefinitely? – Zach Saw Aug 21 '12 at 06:22
  • 7
    @ZachSaw: Your second comment says that interlocked operations "lock" at some stage; the term "lock" generally implies that one task can maintain exclusive control of a resource for an unbounded length of time; the primary advantage of lock-free programming is that it avoids the danger of resource becoming unusable as a result of the owning task getting waylaid. The bus synchronization used by the interlocked class isn't just "generally faster"--on most systems it has a bounded worst-case time, whereas locks do not. – supercat Aug 21 '12 at 15:01
  • @supercat: If you interpret "lock" like that, "interlocked" which comprises the word "lock" would mean the same thing -- it clearly doesn't! – Zach Saw Aug 28 '12 at 03:06
  • 3
    If you say "lock-free multi-threading is for real threading experts, of which I'm not one." I will never attempt to write lock-free multi-threading code. :) – Teoman shipahi Sep 29 '16 at 15:18
48

"volatile" does not replace Interlocked.Increment! It just makes sure that reads and writes go to the variable itself rather than a stale cached copy.

Incrementing a variable requires actually three operations:

  1. read
  2. increment
  3. write

Interlocked.Increment performs all three parts as a single atomic operation.
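As a sketch of how those three parts can be fused: an atomic increment can be expressed as a CompareExchange retry loop (this illustrates the semantics, not necessarily how the BCL actually implements Interlocked.Increment).

```csharp
using System;
using System.Threading;

class Program
{
    static int counter;

    // An increment built from CompareExchange: retry until the
    // read-increment-write triple succeeds with nobody else having
    // written to the location in between.
    static int AtomicIncrement(ref int location)
    {
        while (true)
        {
            int read = location;    // 1. read
            int next = read + 1;    // 2. increment
            // 3. write, but only if the value is still what we read
            if (Interlocked.CompareExchange(ref location, next, read) == read)
                return next;
        }
    }

    static void Main()
    {
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < 50000; n++) AtomicIncrement(ref counter);
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        Console.WriteLine(counter); // always prints 200000
    }
}
```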

Michael Damatov
  • 15,253
  • 10
  • 46
  • 71
  • 5
    Said another way, Interlocked changes are full-fenced and as such are atomic. Volatile members are only partially-fenced and as such are not guarenteed to be thread-safe. – JoeGeeky Dec 04 '11 at 19:58
  • 2
    Actually, `volatile` does *not* make sure the variable is not cached. It just puts restrictions on how it can be cached. For example, it can still be cached in things the CPU's L2 cache because they're made coherent in hardware. It can still be prefected. Writes can still be posted to cache, and so on. (Which I think was what Zach was getting at.) – David Schwartz Dec 04 '15 at 12:36
45

Either lock or interlocked increment is what you are looking for.

Volatile is definitely not what you're after - it simply tells the compiler to treat the variable as always changing, even when the current code path would otherwise allow the compiler to optimize away a read from memory.

e.g.

while (m_Var)
{ }

if m_Var is set to false in another thread but it's not declared as volatile, the compiler is free to make it an infinite loop (though that doesn't mean it always will) by checking against a CPU register (e.g. EAX, because that was what m_Var was fetched into from the very beginning) instead of issuing another read to the memory location of m_Var (which may be cached - we don't know and don't care, and that's the point of the cache coherency of x86/x64).

All the earlier posts that mention instruction reordering simply show their authors don't understand x86/x64 architectures. Volatile does not issue read/write barriers, as implied by the earlier posts saying 'it prevents reordering'. In fact, thanks again to the MESI protocol, we are guaranteed the result we read is always the same across CPUs, regardless of whether the actual results have been retired to physical memory or simply reside in the local CPU's cache. I won't go too far into the details, but rest assured that if this went wrong, Intel/AMD would likely issue a processor recall! This also means that we do not have to care about out-of-order execution etc. Results are always guaranteed to retire in order - otherwise we are stuffed!

With Interlocked.Increment, the processor needs to go out, fetch the value at the address given, then increment and write it back -- all while having exclusive ownership of the entire cache line (lock xadd), to make sure no other processors can modify its value.

With volatile, you'll still end up with just one instruction (assuming the JIT is as efficient as it should be) - inc dword ptr [m_Var]. However, the processor (cpuA) doesn't ask for exclusive ownership of the cache line the way the interlocked version does. As you can imagine, this means other processors could write an updated value back to m_Var after it's been read by cpuA. So instead of the value having been incremented twice, you end up with it incremented just once.

Hope this clears up the issue.

For more info, see 'Understand the Impact of Low-Lock Techniques in Multithreaded Apps' - http://msdn.microsoft.com/en-au/magazine/cc163715.aspx

p.s. What prompted this very late reply? All the replies were so blatantly incorrect (especially the one marked as answer) in their explanation that I just had to clear it up for anyone else reading this. *shrugs*

p.p.s. I'm assuming that the target is x86/x64 and not IA64 (which has a different memory model). Note that Microsoft's ECMA specs are screwed up in that they specify the weakest memory model instead of the strongest one (it's always better to specify against the strongest memory model so it is consistent across platforms - otherwise code that would run 24-7 on x86/x64 may not run at all on IA64, although Intel has implemented a similarly strong memory model for IA64) - Microsoft admitted this themselves - http://blogs.msdn.com/b/cbrumme/archive/2003/05/17/51445.aspx.

Zach Saw
  • 4,308
  • 3
  • 33
  • 49
  • 3
    Interesting. Can you reference this? I'd happily vote this up, but posting with some aggressive language 3 years after a highly voted answer that is consistent with the resources I've read is going to require a bit more tangible proof. – Steven Evers Jul 07 '11 at 03:28
  • If you can point to which part you want referencing, I'd be happy to dig up some stuff from somewhere (I highly doubt I've given any x86/x64 vendor trade secrets away, so these should be easily available off wiki, Intel PRMs (programmer's reference manuals), MSFT blogs, MSDN or something similar)... – Zach Saw Jul 07 '11 at 04:19
  • Also, I don't think it's that inconsistent with what the others have replied, only in their explanation -- suggesting that volatile prevents CPU from caching the result of the variable. That's utterly ridiculous. How is that consistent with anything you've read? If you can find anything in x86/x64 that allows you to set just a 32-bit/64-bit wide memory location to write-through cache or Windows allowing clients to change a specific memory location to write-through on the fly, and then adjusting that accordingly when GC compacts the heap, hence by-passing CPU cache... – Zach Saw Jul 07 '11 at 04:27
  • 2
    Why anyone would want to prevent the CPU from caching is beyond me. The whole real estate (definitely not negligible in size and cost) dedicated to perform cache coherency is completely wasted if that's the case... Unless you require no cache coherency, such as a graphics card, PCI device etc, you wouldn't set a cache line to write-through. – Zach Saw Jul 07 '11 at 04:29
  • 4
    Yes, everything you say is if not 100% at least 99% on the mark. This site is (mostly) pretty useful when you are in the rush of development at work but unfortunately the accuracy of the answers corresponding to the (game of) votes is not there. So basically in stackoverflow you can get a feeling of what is the popular understanding of the readers not what it really is. Sometimes the top answers are just pure gibberish - myths of kind. And unfortunately this is what breeds into the folks who come across the read while solving the problem. It's understandable though, nobody can know everything. – user1416420 Dec 14 '12 at 08:03
  • 1
    The problem with this answer, and your comments everywhere on this question, is that it's exclusive to x86, when the question was not. Knowing about the underlying hardware memory model is useful at times, but doesn't replace knowledge of the CLR memory model. For example, just because a memory barrier is implicit on x86 does not mean that the CLR memory model doesn't require memory barriers for `volatile` (more than C++ `volatile`). .NET code runs on a half a dozen architectures, and C++ far more than that. – Ben Voigt Feb 10 '13 at 16:19
  • 1
    @BenVoigt I could go on and answer about all the architectures .NET runs on, but that would take a few pages and is definitely not suitable for SO. It is far better to educate people based on the most widely used .NET underlying hardware mem-model than one that is arbitrary. And with my comments 'everywhere', I was correcting the mistakes people were making in assuming flushing / invalidating the cache etc. They made assumptions about the underlying hardware without specifying which hardware. – Zach Saw Feb 10 '13 at 22:39
  • @BenVoigt Also do note that Microsoft stuffed up their ECMA memory model specs to begin with, so there's no blanketing answer to a question like this. The question also does not mention the CLR memory model. ECMA is the standard, not CLR. And ECMA does not mandate memory barrier. So perhaps you were just being pedantic when you posted your comment? – Zach Saw Feb 10 '13 at 22:45
  • 1
    @Zach: Even the ECMA model gives acquire and/or release semantics (not full barrier, but these are also types of barriers/fences) to volatile accesses, does it not? – Ben Voigt Feb 11 '13 at 01:49
  • @BenVoigt: From memory, it specifies weak memory model - hence no guarantees for volatile accesses. CLR on the other hand implements strong memory model on IA64. – Zach Saw Feb 11 '13 at 06:17
  • 1
    @Zach: According to [Microsoft's explanation which you linked to](http://msdn.microsoft.com/en-us/magazine/cc163715.aspx#S3), it's not that weak. It doesn't do anything extra on x86, because x86 itself makes stronger guarantees, but don't mistake that for thinking that `volatile` doesn't carry any guarantee. – Ben Voigt Feb 11 '13 at 06:57
  • @BenVoigt: Like I said in my answer, I'm not making any assumptions for any other architectures (others were -- which was what prompted me to post this answer). However, without a reference (from official Microsoft source on ECMA std) it's all but conjecture on both our parts. – Zach Saw Feb 12 '13 at 00:28
  • @ZachSaw, The only good answer here and the very first one I read which really explain clearly why and what is "Volatile" for. Thanks you very much. – Eric Ouellet Dec 09 '20 at 19:09
  • @ZachSaw • 8.10 Execution order _The ordering of side effects is preserved with respect to volatile reads and writes._ • 15.5.4 Volatile fields ◾ _A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence._ ◾ _A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence._ ECMA-334. – Yarl Jan 05 '21 at 09:53
18

Interlocked functions do not lock. They are atomic, meaning that they can complete without the possibility of a context switch during the increment. So there is no chance of deadlock or waiting.

I would say that you should always prefer it to a lock and increment.

Volatile is useful if you need writes in one thread to be read in another, and if you want the optimizer to not reorder operations on a variable (because things are happening in another thread that the optimizer doesn't know about). It's an orthogonal choice to how you increment.

This is a really good article if you want to read more about lock-free code and the right way to approach writing it:

http://www.ddj.com/hpc-high-performance-computing/210604448

Lou Franco
  • 87,846
  • 14
  • 132
  • 192
13

lock(...) works, but may block a thread, and could cause deadlock if other code is using the same locks in an incompatible way.

Interlocked.* is the correct way to do it ... much less overhead as modern CPUs support this as a primitive.

volatile on its own is not correct. A thread attempting to retrieve and then write back a modified value could still conflict with another thread doing the same.

Rob Walker
  • 46,588
  • 15
  • 99
  • 136
10

I did some tests to see how the theory actually works: kennethxu.blogspot.com/2009/05/interlocked-vs-monitor-performance.html. My test was more focused on CompareExchange, but the result for Increment is similar. Interlocked is not necessarily faster in a multi-CPU environment. Here is the test result for Increment on a two-year-old 16-CPU server. Bear in mind that the test also involves the safe read after the increment, which is typical in the real world.

D:\>InterlockVsMonitor.exe 16
Using 16 threads:
          InterlockAtomic.RunIncrement         (ns):   8355 Average,   8302 Minimal,   8409 Maxmial
    MonitorVolatileAtomic.RunIncrement         (ns):   7077 Average,   6843 Minimal,   7243 Maxmial

D:\>InterlockVsMonitor.exe 4
Using 4 threads:
          InterlockAtomic.RunIncrement         (ns):   4319 Average,   4319 Minimal,   4321 Maxmial
    MonitorVolatileAtomic.RunIncrement         (ns):    933 Average,    802 Minimal,   1018 Maxmial
Kenneth Xu
  • 1,514
  • 15
  • 15
  • The code sample you tested was soooo trivial though - it really doesn't make much sense testing it that way! The best would be to understand what the different methods are actually doing and use the appropriate one based on the usage scenario you have. – Zach Saw Jun 23 '11 at 15:10
  • @Zach, the how discussion here was about the scenario of increasing a counter in a thread safe manner. What other usage scenario was in your mind or how would you test it? Thanks for the comment BTW. – Kenneth Xu Jun 27 '11 at 21:05
  • Point is, it's an artificial test. You're not going to hammer the same location that often in any real world scenario. If you are, then well you're bottlenecked by the FSB (as shown in your server boxes). Anyway, look at my reply on your blog. – Zach Saw Jul 07 '11 at 01:18
  • 3
    Looking at it back again. If the true bottleneck is with the FSB, then the monitor implementation should observe the same bottleneck. The real difference is that Interlocked is doing busy wait and retry, which becomes a real issue with high-performance counting. At least I hope my comment raises the attention that Interlocked is not always the right choice for counting. The fact that people are looking at alternatives explains it well. You need a long adder http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/LongAdder.html – Kenneth Xu Oct 06 '13 at 16:01
4

I would like to add, to what is mentioned in the other answers, the difference between volatile, Interlocked, and lock:

The volatile keyword can be applied to fields of these types:

  • Reference types.
  • Pointer types (in an unsafe context). Note that although the pointer itself can be volatile, the object that it points to cannot. In other words, you cannot declare a "pointer to volatile".
  • Simple types such as sbyte, byte, short, ushort, int, uint, char, float, and bool.
  • An enum type with one of the following base types: byte, sbyte, short, ushort, int, or uint.
  • Generic type parameters known to be reference types.
  • IntPtr and UIntPtr.

Other types, including double and long, cannot be marked "volatile" because reads and writes to fields of those types cannot be guaranteed to be atomic. To protect multi-threaded access to those types of fields, use the Interlocked class members or protect access using the lock statement.
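A minimal sketch of that advice for a 64-bit counter (the field name is illustrative):

```csharp
using System;
using System.Threading;

class Program
{
    // long cannot be declared volatile; use Interlocked for both
    // writes and reads instead.
    static long bigCounter;

    static void Main()
    {
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < 100000; n++)
                    Interlocked.Increment(ref bigCounter);
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();

        // Interlocked.Read gives an untorn 64-bit read; a plain read of a
        // long is not guaranteed to be atomic on 32-bit platforms.
        Console.WriteLine(Interlocked.Read(ref bigCounter)); // always prints 400000
    }
}
```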

T.S.
  • 18,195
  • 11
  • 58
  • 78
V. S.
  • 1,086
  • 14
  • 14
1

I'm just here to point out the mistake about volatile in Orion Edwards' answer.

He said:

"If it is volatile, this just ensures the two CPUs see the same data at the same time."

That's wrong. Microsoft's documentation on volatile states:

"On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor. Similarly, a volatile write operation does not guarantee that the value written would be immediately visible to other processors."

HeSansi
  • 11
  • 1
  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/30872542) – sjakobi Jan 28 '22 at 01:02