private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    instrumentInfos[instrument.Id] = info;  // need to make it visible to other threads!
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    return instrumentInfos[instrument.Id];  // need to obtain fresh value!
}

SetInstrumentInfo and GetInstrumentInfo are called from different threads. InstrumentInfo is an immutable class. Am I guaranteed to get the most recent copy when calling GetInstrumentInfo? I'm afraid I might receive a "cached" copy. Should I add some kind of synchronization?

Declaring instrumentInfos as volatile wouldn't help, because I would need to declare the array items as volatile, not the array itself.
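For what it's worth, the closest C# gets to "volatile array items" is routing each slot through a holder object with a volatile field. A sketch of that workaround (the holder and stub types below are my own illustration, not part of the real code):

```csharp
public sealed class InstrumentInfoStub { public int Price; }

public sealed class InfoHolder
{
    // volatile is legal on a field, so each "slot" gets volatile
    // semantics by going through this field instead of a raw array element.
    public volatile InstrumentInfoStub Value;
}

public sealed class HolderRegistry
{
    private readonly InfoHolder[] slots;

    public HolderRegistry(int size)
    {
        // Holders are allocated once up front; only Value is ever replaced.
        slots = new InfoHolder[size];
        for (int i = 0; i < size; i++) slots[i] = new InfoHolder();
    }

    public void Set(int id, InstrumentInfoStub info) { slots[id].Value = info; }

    public InstrumentInfoStub Get(int id) { return slots[id].Value; }
}
```

The extra cost is one indirection per access, in exchange for volatile read/write semantics on every slot.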

Does my code have a problem, and if so, how do I fix it?

UPD1:

I need my code to work in real life, not to conform to every specification! So if my code works in real life but would not work "theoretically" on some computer under some environment, that's OK!

  • I need my code to work on modern X64 server (currently 2 processors HP DL360p Gen8) under Windows using latest .NET Framework.
  • I don't need my code to work on exotic computers, under Mono, or anywhere else
  • I don't want to introduce latency, as this is HFT software. Since "Microsoft's implementation uses a strong memory model for writes. That means writes are treated as if they were volatile", I likely don't need to add an extra Thread.MemoryBarrier, which would do nothing but add latency. I think we can rely on Microsoft keeping the "strong memory model" in future releases; at the very least, it's very unlikely that Microsoft will change the memory model, so let's assume it won't.

UPD2:

The most recent suggestion was to use Thread.MemoryBarrier();. Now I don't understand the exact places where I must insert it to make my program work on a standard configuration (x64, Windows, Microsoft .NET 4.0). Remember, I don't want to insert lines "just to make it possible to launch your program on IA64 or .NET 10.0". Speed is more important to me than portability. However, it would also be interesting to know how to update my code so it works on any computer.

UPD3

.NET 4.5 solution:

    public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
    {
        if (instrument == null || info == null)
        {
            return;
        }
        Volatile.Write(ref instrumentInfos[instrument.Id], info);
    }

    public InstrumentInfo GetInstrumentInfo(Instrument instrument)
    {
        InstrumentInfo result = Volatile.Read(ref instrumentInfos[instrument.Id]);
        return result;
    }
  • @svick so I have to use `Thread.MemoryBarrier();`? The code as written above will not work? – Oleg Vazhnev Aug 14 '12 at 13:07
    Yes, that's one option. But I would prefer using `Thread.Volatile(Read|Write)` (or `Volatile.(Read|Write)`), if you can. And yes, I think your code won't work in that it can read stale data. – svick Aug 14 '12 at 13:09
  • Worse. It'll work most of the time, but has the potential to not work; a potential that varies depending on which machine it is running on. Code that doesn't work is much easier to fix than that. – Jon Hanna Aug 14 '12 at 15:56
  • @JonHanna on which machine will this code work? If it works on an x64 machine then I would prefer not to introduce anything, because I can't afford latency; it's part of HFT software. – Oleg Vazhnev Aug 14 '12 at 16:02
  • @javapowered off the top of my head, I'm pretty sure x86 would have it work, and don't know about the rest (as interesting as I find these details, I'd just put in a barrier and not think about the different memory models of different cpus). – Jon Hanna Aug 14 '12 at 17:49
  • When I first read this question, I wondered if the assignment itself was thread-safe. If 'info' in the setter was a long, ulong, or double it might be possible for the getter to get an instrumentInfos[id] which was only half-set, (i.e. the first 32 bits were set with new values, but the next 32 bits are still older values, using a 32-bit computer). However, assignment of objects are machine word size pointers and are assigned atomically, so you are okay there, unless it is a struct (not sure about structs, are they pointers to the struct, or the values?). – Xantix Aug 14 '12 at 19:03
    @Xantix i.e. *reference* assignments are guaranteed to be thread-safe in both the C# spec and in the CLi. – Peter Ritchie Aug 14 '12 at 19:07
  • Basically, the whole question is void, i.e. it is obviously the wrong question to ask. Your problem cannot possibly be what you are asking. Look further down in the answers section for details, where it talks about interprocessor interrupts, cache coherency and reader-writer locks. – Dimitrios Staikos Aug 15 '12 at 07:40
  • @DimitrisStaikos could you add implementation that you suggest? InstrumentInfo is immutable class. – Oleg Vazhnev Aug 15 '12 at 07:45

8 Answers


This is a question with a long and complicated answer, but I'll try to distill it into some actionable advice.

1. Simple solution: only access instrumentInfos under a lock

The easiest way to avoid the unpredictability in multi-threaded programs is to always protect shared state using locks.

Based on your comments, it sounds like you consider this solution to be too expensive. You may want to double-check that assumption, but if that's really the case, then let's look at the remaining options.

2. Advanced solution: Thread.MemoryBarrier

Alternatively, you can use Thread.MemoryBarrier:

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM]; 

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info) 
{ 
    if (instrument == null || info == null) 
    { 
        return; 
    } 

    Thread.MemoryBarrier(); // Prevents an earlier write from getting reordered with the write below

    instrumentInfos[instrument.Id] = info;  // need to make it visible to other threads! 
} 

public InstrumentInfo GetInstrumentInfo(Instrument instrument) 
{ 
    InstrumentInfo info = instrumentInfos[instrument.Id];  // need to obtain fresh value! 
    Thread.MemoryBarrier(); // Prevents a later read from getting reordered with the read above
    return info;
}

Using the Thread.MemoryBarrier before the write and after the read prevents the potential trouble. The first memory barrier prevents the writing thread from reordering a write that initializes the object's field with the write that publishes the object into the array, and the second memory barrier prevents the reading thread from reordering the read that receives the object from the array with any subsequent reads of the fields of that object.

As a side note, .NET 4 also exposes Thread.VolatileRead and Thread.VolatileWrite, which use Thread.MemoryBarrier as shown above. However, there are no overloads of Thread.VolatileRead and Thread.VolatileWrite for reference types other than System.Object.

3. Advanced solution (.NET 4.5): Volatile.Read and Volatile.Write

.NET 4.5 exposes Volatile.Read and Volatile.Write methods that are more efficient than full memory barriers. If you are targeting .NET 4, this option won't help.

4. "Wrong but will happen to work" solution

You should never ever rely on what I'm about to say. But... it is very unlikely that you'd be able to reproduce the issue that is present in your original code.

In fact, on X64 under .NET 4, I would be extremely surprised if you could ever reproduce it. X86/X64 provides fairly strong memory-ordering guarantees, so these kinds of publication patterns happen to work correctly. The .NET 4 C# compiler and the .NET 4 CLR JIT compiler also avoid optimizations that would break your pattern. So, none of the three components that are allowed to reorder the memory operations will actually do so.

That said, there are (somewhat obscure) variants of the publication pattern that actually don't work on .NET 4 in X64. So, even if you think that the code will never need to run on any architecture other than .NET 4 X64, your code will be more maintainable if you use one of the correct approaches, even though the issue is not presently reproducible on your server.

Igor Ostrovsky
  • How does this prevent the computer from using a cached value? I need the value written to main memory every time instead of the cache (so another processor can use it). So the question is not about "reordering"... Another question: in .NET 4.5 can I use Volatile.Read for array items? – Oleg Vazhnev Aug 16 '12 at 07:39
    X64 uses a variant of the [MESI protocol](http://en.wikipedia.org/wiki/MESI_protocol) to keep caches coherent. If a processor modifies a value in its cache, the values in other caches get invalidated. So, cache coherency is not an issue on X64. – Igor ostrovsky Aug 16 '12 at 15:09
    The real issue is due to "reordering" in the compiler (i.e., optimizations) and reordering of instructions as they execute in the processor pipeline (very limited on X64). – Igor ostrovsky Aug 16 '12 at 15:11
  • Volatile.Read can be used with arrays. – Igor ostrovsky Aug 16 '12 at 15:12
  • thanks Igor. I've decided to use .NET 4.5 VolatileRead / VolatileWrite. I've added my solution to my question description. If you can review it it would be great! – Oleg Vazhnev Aug 17 '12 at 21:51

My preferred way to resolve this is with a critical section. C# has a built-in language construct, lock, that solves this problem. The lock statement ensures that no more than one thread can be inside the critical section at the same time.

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null) {
        return;
    }
    lock (instrumentInfos) {
        instrumentInfos[instrument.Id] = info;
    }
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    lock (instrumentInfos) {
        return instrumentInfos[instrument.Id];
    }
}

This does have performance implications, but it always ensures that you get a reliable result. It is especially useful if you ever iterate over the instrumentInfos array; you'll absolutely need a lock statement for code like that.

Note that lock is a general-purpose solution that ensures any compound statements are executed reliably and atomically. In your case, because you're assigning an object reference, you may find it possible to use a simpler construct like a thread-safe list or array, provided these are the only two functions that ever touch instrumentInfos. More details are available here: Are C# arrays thread safe?
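One concrete instance of such a simpler construct (a sketch only; ConcurrentDictionary stands in for the raw array here, and the string payload is a placeholder for InstrumentInfo):

```csharp
using System.Collections.Concurrent;

public sealed class InfoStore
{
    // Keyed by instrument id; all reads and writes are thread-safe.
    private readonly ConcurrentDictionary<int, string> infos =
        new ConcurrentDictionary<int, string>();

    public void Set(int instrumentId, string info)
    {
        if (info == null) return;
        infos[instrumentId] = info;  // atomic add-or-replace
    }

    public string Get(int instrumentId)
    {
        string info;
        infos.TryGetValue(instrumentId, out info);
        return info;  // null when the id has never been set
    }
}
```

The indexer write is an atomic add-or-replace, and TryGetValue is lock-free for typical read-heavy workloads.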

EDIT: The key question about your code is: what is InstrumentInfo? If it is a reference type, you will need to be careful to always construct a new InstrumentInfo object each time you update the shared array, since mutating an existing object from a separate thread can cause issues. If it is a struct, the lock statement will provide all the safety you need, since its values will be copied on write and read.
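The "always construct a new object" advice in this edit amounts to keeping InstrumentInfo immutable. A sketch of that shape (the field names here are hypothetical):

```csharp
public sealed class InstrumentInfoExample
{
    // readonly fields set once in the constructor: the object can never
    // change after publication, so readers cannot observe a half-update.
    public readonly decimal Bid;
    public readonly decimal Ask;

    public InstrumentInfoExample(decimal bid, decimal ask)
    {
        Bid = bid;
        Ask = ask;
    }

    // "Updating" produces a fresh object instead of mutating this one.
    public InstrumentInfoExample WithBid(decimal bid)
    {
        return new InstrumentInfoExample(bid, Ask);
    }
}
```

Because no field can change after construction, updates swap in a whole new instance rather than modifying one that other threads may already hold.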

EDIT 2: From a bit more research, it appears that there do exist cases where changes to a thread-shared variable won't appear under certain compiler configurations. This thread discusses a case where a bool value can be set on one thread and never observed on the other: Can a C# thread really cache a value and ignore changes to that value on other threads?

I notice, however, that this case only arises where a .NET value type is involved. If you look at Joe Ericson's response, the decompilation of that particular code shows why: the thread-shared value is read into a register and then never re-read, since the compiler optimizes away the unnecessary read in the loop.

Given that you're using an array of shared objects, and given that you've encapsulated all accessors, I think you are perfectly safe using your raw methods. HOWEVER: the performance loss from locking is so tiny as to be virtually negligible. For example, I wrote a simple test application that called SetInstrumentInfo and GetInstrumentInfo 10 million times. The performance results were as follows:

  • Without Locking, 10 million set and get calls: 2 seconds 149 ms
  • With locking, 10 million set and get calls: 2 seconds 195 ms

This translates into raw performance of roughly:

  • Without locking, you can execute 4653327 get & set calls per second.
  • With locking, you can execute 4555808 get & set calls per second.

From my perspective, the lock() statement is worth the 2.1% performance tradeoff. In my experience, get and set methods like these occupy only the tiniest fraction of the application's execution time. You'd probably be best off looking for potential optimizations elsewhere, or spending additional money to get a higher clock rate CPU.
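A sketch of the kind of measurement loop described above (the harness is my own; the 10-million-call figures quoted are from the answer, and results will vary by machine):

```csharp
using System;
using System.Diagnostics;

public static class LockBench
{
    private static readonly object[] slots = new object[64];
    private static readonly object gate = new object();

    public static void Main()
    {
        const int iterations = 10000000;
        var payload = new object();

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            slots[i & 63] = payload;       // unsynchronized set
            var unused = slots[i & 63];    // unsynchronized get
        }
        sw.Stop();
        Console.WriteLine("no lock: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            lock (gate) { slots[i & 63] = payload; }
            lock (gate) { var unused = slots[i & 63]; }
        }
        sw.Stop();
        Console.WriteLine("lock:    " + sw.ElapsedMilliseconds + " ms");
    }
}
```

An uncontended lock is cheap, which is why the measured difference in the answer is only a few percent; under heavy contention from many threads the gap would widen.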

Ted Spence
  • `lock` and `Mutex` are two different things. – Peter Ritchie Aug 14 '12 at 19:10
  • Sorry, I have a very inexact way of writing things - I was thinking "mutual exclusion" rather than "C# Mutex Class". I'll rewrite to make that more clear. – Ted Spence Aug 14 '12 at 19:25
  • lock is too expensive. I want to avoid locks if possible as i'm writing HFT. I will probably use `Thread.MemoryBarrier();` as suggested above. – Oleg Vazhnev Aug 14 '12 at 20:20
  • From the documentation, Thread.MemoryBarrier probably won't affect your code: http://stackoverflow.com/questions/3556351/why-we-need-thread-memorybarrier . If you're simply concerned about highest possible performance, you could A) ensure that `InstrumentInfo` is an `object`, and B) run without a `lock()` statement, and C) ensure that all the functions that modify the shared array are simple. In that case, you can rely on the fact that an object reference is a 64-bit pointer and generally likely to be updated in a single atomic instruction. – Ted Spence Aug 14 '12 at 20:24
  • @TedSpence we are talking about caching. "Atomicity" is not enough. To prevent caching I have to use some kind of inter-thread synchronization; otherwise each thread will use its own "cache". – Oleg Vazhnev Aug 15 '12 at 04:07
  • This is an interesting conversation, thank you for sparking it! I'm not certain by what mechanism a thread would "cache" a value. Can you elaborate? Are you retrieving the value from the array, then storing it in a local variable? – Ted Spence Aug 15 '12 at 21:46
  • @TedSpence each CPU (or core) has its own "very fast cache", so two different threads may refer to the same variable that is actually stored in physically different places. I was reading a brilliant article about the .NET memory model but I don't know how to find it now. When you use "lock" you tell the computer to move the "cached" value to main memory, or to retrieve the object from "main memory" instead of the "cache". – Oleg Vazhnev Aug 16 '12 at 07:33
  • @TedSpence probably this article can help http://msdn.microsoft.com/en-us/magazine/cc163715.aspx – Oleg Vazhnev Aug 16 '12 at 07:43
  • Virtually all modern CPUs will invalidate caches. As soon as a write is committed to physical memory, a modern CPU will notify all other processors to invalidate their caches for that memory location. However, in most cases, that "write to physical memory" statement may take place hundreds of microseconds after the CPU executed the write instruction. The `lock` statement doesn't update the cache itself, it makes sure that each CPU waits for write instructions to propagate before accessing the memory. – Ted Spence Aug 16 '12 at 16:19

You can use a system-level synchronization primitive (like Mutex), but those are a bit heavy-handed. Mutex is really slow because it's a kernel-level primitive; that's what lets you have mutually exclusive code across processes. That isn't what you need, and you could use something much less costly in performance, like lock or Monitor.Enter/Monitor.Exit, which works within a single process.

You can't use something like VolatileRead/Write or MemoryBarrier if elements in the array are not assigned atomically, because then you need to make the act of writing to the collection atomic, and VolatileRead/Write and MemoryBarrier don't do that. They just give you acquire and release semantics, which means that whatever is written is "visible" to other threads. If you don't make the write to the collection atomic, you can corrupt data; acquire/release semantics won't help with that.

e.g.:

private InstrumentInfo[] instrumentInfos = new InstrumentInfo[Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM];

private readonly object locker = new object();

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    lock(locker)
    {
        instrumentInfos[instrument.Id] = info;
    }
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    lock(locker)
    {
        return instrumentInfos[instrument.Id];
    }
}

A concurrent collection would do the same, but you incur a cost because every operation is guarded by synchronization. Plus, a concurrent collection only knows about the atomicity of access to individual elements; you'd still have to ensure atomicity at your application's level. If you did the following with a concurrent collection:

public bool Contains(InstrumentInfo info)
{
   foreach(var element in instrumentInfos)
   {
       if(element == info) return true;
   }
   return false; // not found
}

...you'd have at least a couple of problems. First, you're not stopping SetInstrumentInfo from modifying the collection while it's being enumerated (many collections do not support this and throw an exception); i.e. the collection is only "guarded" during the retrieval of a single element. Second, the collection is guarded on each iteration: if you have 100 elements and the concurrent collection uses a lock, you take 100 locks (assuming the element to find is last or not found at all). This would be much slower than it needs to be. If you don't use a concurrent collection, you can simply use lock to get the same result:

public bool Contains(InstrumentInfo info)
{
   lock(locker)
   {
      foreach(var element in instrumentInfos)
      {
          if(element == info) return true;
      }
      return false; // not found
   }
}

and take a single lock, which is much more performant.

UPDATE I think it's important to point out that if InstrumentInfo is a struct, the use of lock (or some other synchronization primitive) becomes even more important, because it has value semantics and every assignment requires moving as many bytes as InstrumentInfo uses to store its fields (i.e. it's no longer a 32-bit or 64-bit native-word reference assignment). A simple InstrumentInfo assignment (e.g. to an array element) would never be atomic and thus would not be thread-safe.

UPDATE 2 If replacing an array element is atomic (it becomes a simple reference assignment if InstrumentInfo is a reference type, and the IL instruction stelem.ref is atomic), and if the only code that deals with instrumentInfos is what you posted, then you can use Thread.MemoryBarrier():

public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
{
    if (instrument == null || info == null)
    {
        return;
    }
    Thread.MemoryBarrier();
    instrumentInfos[instrument.Id] = info;
}

public InstrumentInfo GetInstrumentInfo(Instrument instrument)
{
    var result = instrumentInfos[instrument.Id];
    Thread.MemoryBarrier();
    return result;
}

... which would be the equivalent of declaring each element in the array volatile if that were possible.

VolatileWrite won't work because it requires a reference to the variable it will write to, and you can't give it a reference to an array element.
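As a side note, one primitive that does accept a reference to an array element and combines atomicity with a full fence is Interlocked. A hedged sketch (the string payload is a placeholder for a real immutable info class):

```csharp
using System.Threading;

public static class InterlockedPublish
{
    private static readonly string[] slots = new string[16];

    // Atomic replace with full-fence semantics; returns the previous value.
    public static string Publish(int id, string info)
    {
        return Interlocked.Exchange(ref slots[id], info);
    }

    // Fenced read: CompareExchange with identical value and comparand
    // returns the current value; if the slot is null it writes null back,
    // which is a no-op.
    public static string Read(int id)
    {
        return Interlocked.CompareExchange(ref slots[id], null, null);
    }
}
```

This is heavier than Volatile.Read/Write but works on .NET 4, and the returned previous value is handy when the old object must be recycled.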

Peter Ritchie
  • +1, but make the locker object readonly to prevent anyone accidentally reassigning it. – Xantix Aug 14 '12 at 19:09
  • @Xantix Useful point (updated code), but it doesn't prevent a field from changing value, it just prevents assignment to the field outside the constructor. But, this would prevent the field from changing while it could be used in any of the locks shown in the code... – Peter Ritchie Aug 14 '12 at 19:12
  • Given that the original question seems to be using a global static value - probably reading from a MIDI device or something similar - I figured it was safe to lock directly on the array itself. Thanks for adding the description of how to do a "foreach" safely! – Ted Spence Aug 14 '12 at 19:29
  • @TedSpence If the array represents *all* operations that need to be atomic amongst themselves, then you can lock on the array. If you have multiple reasons for atomic operations using the array, then another object (objects) is necessary. The very low overhead of allocating another object to lock just makes it easier to always use another object. If the array was ever made visible outside the class, a potential deadlock is introduced. I don't see anything in the post about a static value... – Peter Ritchie Aug 14 '12 at 19:33
  • "If you don't make the write to the collection atomic, you can corrupt data--acquire/release semantics won't help with that." is wrong, unless InstrumentInfo is a struct (or I suppose if this actually were a collection- it is not, it's an array). Assignment to array elements is an atomic operation because it means writing a managed pointer, which is within the word size boundary. – Chris Shain Aug 14 '12 at 19:37
  • @ChrisShain we don't know if InstrumentInfo is a struct, and I can't find anything that definitively says that and array element replacement (stelem.ref) is just a reference assignment and thus atomic. – Peter Ritchie Aug 14 '12 at 19:43
  • It's implied pretty strongly in sections III.4.26 & III.4.27 of the CLI spec: http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-335.pdf, especially when combined with section I.12.6.6 on atomic access. – Chris Shain Aug 14 '12 at 19:51
  • @ChrisShain Yes, if it is true, then my Update 2 applies. – Peter Ritchie Aug 14 '12 at 19:56
  • @ChrisShain If these values are being written/read thousands of times a second, `MemoryBarrier` might have a minute performance improvement. Otherwise, `lock` is generally easier to use and understand by a wider audience. – Peter Ritchie Aug 14 '12 at 19:57
  • InstrumentInfo is a class. Actually I never use `struct` in C#. Do they still exist? :) – Oleg Vazhnev Aug 14 '12 at 20:16
  • @javapowered So, if you trust ECMA 335 that reference array writes are atomic, and you only have the code you posted, then you can just use `MemoryBarrier` as I've described in my answer. – Peter Ritchie Aug 14 '12 at 20:34
  • why did you insert `Thread.MemoryBarrier()` in these places? Here people insert it before and after each write and read http://stackoverflow.com/questions/3556351/why-we-need-thread-memorybarrier Also, because of "Microsoft's implementation uses a strong memory model for writes. That means writes are treated as if they were volatile", it seems I don't need a MemoryBarrier for the write. – Oleg Vazhnev Aug 15 '12 at 05:51
    @javapowered That's appreciably the order you'd get with `VolatileRead` and `VolatileWrite`. For some detail on *why* they are where they are: http://stackoverflow.com/questions/1773680/thread-volatileread-implementation – Peter Ritchie Aug 15 '12 at 12:54
  • @PeterRitchie ok thanks, do you know how I can test whether this fixes my problem? Actually I need to test that without `MemoryBarrier` I do have a problem, but I don't know how to do that. I can't use `Thread.Sleep`, for example, as that call will generate a `MemoryBarrier`, which I need to avoid while testing... – Oleg Vazhnev Aug 15 '12 at 20:23
  • Instead of sleep you could use something like while(i<1000000)EmptyMethod(); – Peter Ritchie Aug 15 '12 at 21:40

As far as I know from my years of device driver programming, volatile is used for stuff that can change outside the control of the CPU, that is through hardware intervention, or for hardware memory mapped into system space (CSRs and such). When a thread updates a memory location the CPU locks the cache line and raises an interprocessor interrupt so that other CPUs can discard it. So there is no chance you read stale data. Theoretically you might worry only about concurrent writes and reads to the same array position, if the data is not atomic (multiple quads), since a reader might read partially updated data. I think this cannot occur on an array of references.

I will take a wild guess at what you are trying to achieve, because it looks similar to an app+driver I developed in the past, for displaying streaming video from a cellphone camera tester board. The video display app could apply some basic image manipulation on each frame (white balance, offset pixels, etc). These settings were SET from the UI and GET from the processing threads. From the point of view of the processing thread the settings were immutable. Looks like you are trying to do something similar with audio of some sort instead of video.

The approach I took in my C++ app was to copy the current settings to a struct that accompanied the 'frame'. So each frame had its own copy of the settings that would be applied to it. The UI thread took a lock to write changes to the settings, and the processing threads took a lock to copy the settings. The lock was required because multiple quads were moved around, and without a lock we would have the certainty of reading partially updated settings. Not that it would matter much, since probably no one would notice a messed-up frame during streaming, but if they paused the video or saved a frame to disk, then they would most certainly spot a shiny green pixel amidst a dark wall. In the case of sound, glitches are much easier to spot, even during streaming.

That was case one. Let's see case two.

How do you make a radical reconfiguration to a device while the device is being used by an unknown number of threads? If you just go ahead and do it, then you are guaranteed that several threads will start with configuration A and mid-process be confronted with configuration B, which with a high degree of certainty means death. That's where you need something like reader-writer locks: a synchronization primitive that allows you to wait for current activity to finish while blocking new activity. That's the essence of a reader-writer lock.

End of case two. Now let's see what's your trouble, if of course my wild guess is right.

If your worker threads do a GET at the beginning of their processing cycle and hold on to that reference for the whole cycle then you don't need locks since reference updates are atomic as Peter mentioned. This is the implementation I suggest to you, and it is equivalent to my copying of the settings structure at the beginning of processing of each frame.
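The suggested implementation (one GET at the start of the cycle, with the reference held in a local for the rest of it) can be sketched like this; the stub type and values are my own illustration:

```csharp
using System;

public sealed class Info
{
    public readonly int Version;
    public Info(int v) { Version = v; }
}

public static class CycleDemo
{
    // Stand-in for the shared slot published by SetInstrumentInfo.
    private static Info shared = new Info(1);

    public static int RunOneCycle()
    {
        Info current = shared;   // single GET at the start of the cycle
        int a = current.Version; // every later step uses the local...
        int b = current.Version; // ...so a and b always agree, even if
        shared = new Info(2);    // the shared slot is replaced mid-cycle
        return a + b;
    }

    public static void Main()
    {
        Console.WriteLine(RunOneCycle()); // prints 2 (1 + 1)
    }
}
```

Because the local reference never changes, the whole cycle sees one consistent immutable object regardless of concurrent SET calls.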

However, if you are making multiple GETs from all over the code, then you are in trouble, because within the same processing cycle some calls will return reference A and some calls will return reference B. If that's OK with your app, then you are a lucky guy.

If however you have a problem with this issue, then you have to either fix the bug or try to patch it by building on top of it. Fixing the bug is simple: Eliminate multiple GETs, even if that costs you a minor rewrite so that you can pass the reference around.

If you want to patch the bug, then use Reader-Writer locks. Each thread acquires a reader lock, does a GET, and then holds on to the lock until its cycle is finished. Then it releases the reader lock. The thread that updates the array acquires a writer lock, does the SET and releases the writer lock immediately.
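That patch can be sketched with ReaderWriterLockSlim; the class and member names below are placeholders, since the original answer describes the pattern without code:

```csharp
using System;
using System.Threading;

public sealed class GuardedStore
{
    private readonly string[] slots = new string[16];
    private readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();

    // Writer: exclusive, held only for the assignment.
    public void Set(int id, string info)
    {
        rw.EnterWriteLock();
        try { slots[id] = info; }
        finally { rw.ExitWriteLock(); }
    }

    // Reader: shared; the answer suggests holding it for the whole
    // processing cycle so the cycle never sees a mid-flight update.
    public string GetAndProcess(int id, Action<string> process)
    {
        rw.EnterReadLock();
        try
        {
            var info = slots[id];
            process(info);  // the whole cycle runs under the read lock
            return info;
        }
        finally { rw.ExitReadLock(); }
    }
}
```

Holding the read lock for the entire cycle is exactly what makes a pending writer wait, which is the latency cost the answer goes on to criticize.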

The solution almost works, but it stinks when code holds locks for an extended period of time, even if they are reader locks.

However, even if we forget multithreaded programming proper, there is an even subtler problem. A thread acquires a read lock at the beginning of the cycle, releases it at the end, then starts a new cycle and reacquires it. When the writer appears the thread cannot start its new cycle until the writer is done, which means until all other reader threads that are working right now are done with their processing. This means that some threads will experience unexpectedly high delays before restarting their cycle, because even if they are high priority threads they are blocked by the request for a writer lock, until all current readers also finish their cycle. If you are doing audio or something else that is time critical, then you might want to avoid such behavior.

I hope I made a good guess, and that my explanations were clear :-)

  • `volatile` is used to give memory access acquire or release semantics. This is necessary even if just the CPU is accessing memory in the context of multiple cores. Each core can cache values in its own cache that other cores can't see. Acquire semantics and release semantics mean that cached values will be flushed to physical memory so the read will see any and all writes to that memory. `volatile` also means the compiler can't optimize variables into a register, for compilers that do that sort of thing. – Peter Ritchie Aug 14 '12 at 19:53
  • [link](http://en.wikipedia.org/wiki/Inter-processor_interrupt) And I quote "An inter-processor interrupt (IPI) is a special type of interrupt by which one processor may interrupt another processor in a multiprocessor system. IPIs are typically used to implement a CACHE COHERENCY synchronization point". After 10 years of driver programming and multi-threaded C/C++ programming I never had to use volatile or memory barrier manipulation. Obviously each CPU cannot see what's inside other CPUs, BUT it can send them an IPI to clear particular cache lines IF they happen to have them in their cache. – Dimitrios Staikos Aug 14 '12 at 20:04
  • Basically the whole question asked is completely void. Even if you use lock for reads and writes, so what? What if thread A got the lock and read the value, then thread B got the lock and updated the value? Thread A will still continue using a "stale" object until its work is done. The real problem is to prevent some threads using the old objects and some the new objects at the SAME time. What is needed is a System.Threading.ReaderWriterLock. – Dimitrios Staikos Aug 14 '12 at 20:10
  • No, the start of a `lock` is a volatile read and the end of a `lock` is a volatile write, so you get the acquire and release semantics the guarantee that all cores see all writes make within a `lock` block at the end of the `lock` block. – Peter Ritchie Aug 14 '12 at 20:32
  • Of course a volatile write guarantees some stuff, including barriers that the compiler/OS will put, but (a) you already have the guarantees you need because of cache coherency IF you are writing atomic data, and (b) the guy needs a ReaderWriterLock, with threads holding a reader lock while using an object they got, and threads asking for a writer lock when they want to update the array. That is his problem, not volatile stale reads. In the first place .NET is a funny environment to talk about volatile and memory barriers. If you need volatile and barriers then you are probably NOT programming in C#. – Dimitrios Staikos Aug 15 '12 at 05:37
  • I would suggest that for starters you download the WDK and read the following topics: "Flushing Cached Data during DMA Operations" and "KeFlushIoBuffers". They are not available online, unfortunately. With all the badges you have earned, you should be able to make some sense of what they are talking about. Btw, when you don't understand how things work at the hardware level, when you haven't debugged BSODs and NMIs for a living, you should not be that adamant in your posts. "Acquire and release semantics" is not kernel lingo; you can guess what that implies about the one who utters it. – Dimitrios Staikos Aug 15 '12 at 06:00
  • What are your thoughts on http://www.bluebytesoftware.com/blog/2008/06/13/VolatileReadsAndWritesAndTimeliness.aspx – Peter Ritchie Aug 15 '12 at 12:49
  • Hi Peter, that's an interesting post. It fully backs my view that fences, memory barriers and instruction reordering are of concern ONLY to OS writers and compiler writers. Similarly volatile, TLAs, spinlocks and the like are only of concern to driver writers, because when some hardware, other than the CPU, is modifying memory contents, then the compiler/CPUs must have a hint about what to expect. That's why I say that the current question is void. As Fowler says "it stinks" from a multithreaded programming point of view. – Dimitrios Staikos Aug 16 '12 at 00:02
  • I think I see where the disconnect is; what you should have gotten out of that was the IA64 example. When you're developing a C# application you're not targeting a specific CPU memory model, you're targeting the CLR memory model. MemoryBarrier is documented as only being required for CPUs with weak memory ordering. On x86, it's pointless (and might simply be optimized away by the JIT). If you force your assembly to x86, you would only ever need `volatile`, just for the compiler. I've run into issues with 64-bit AMD and Itanium (and I think x64) with C# code – Peter Ritchie Aug 16 '12 at 20:12

For some reason that I don't have time to find out about, I could not add any comments anywhere, so I added another answer. I apologize for the inconvenience.

UPD3 works of course, but is dangerously misleading.

If the UPD3 solution were actually required, then think of the following scenario: ThreadA runs on CPU1 and does an atomic update to Var1. Then ThreadA gets preempted and the scheduler decides to reschedule it on CPU2. Oops... Since Var1 is in the cache of CPU1, then according to the totally mistaken logic of UPD3, ThreadA should have done a volatile write to Var1 and then a volatile read. But a thread never knows when it gets rescheduled or on which CPU it will end up. So, by that same logic, every thread should always do volatile writes and volatile reads, otherwise it might read stale data. Which means that all threads should do volatile reads/writes all the time.

For a single thread, the problem would be even worse when updating a multibyte structure (non atomic update). Imagine that some of the updates happened on CPU1, some on CPU2, some on CPU3, etc.

It kinda makes me wonder why the world has not come to an end yet.

  • I have submitted another solution in the same question. It doesn't contain copy-paste-ready code, and it contains a fair amount of text. Sorry for the long text, but the topic is quite complicated. If you read it and something I write is still unclear, I will be most glad to help further. – Dimitrios Staikos Aug 22 '12 at 13:26
  • @Dimitrios Staikos if you say that my solution is "dangerously misleading", can you provide or suggest a solution that isn't? Otherwise I have to use the "dangerously misleading" solution, because there is no other solution. I will read your answer below right away – Oleg Vazhnev Aug 22 '12 at 13:40

Alternatively, you could use some of the classes in the System.Collections.Concurrent namespace, such as the ConcurrentBag<T> class. These classes are built to be thread-safe.

http://msdn.microsoft.com/en-us/library/system.collections.concurrent.aspx
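
For the question's keyed-by-`Id` lookup, `ConcurrentDictionary<TKey, TValue>` from that namespace is arguably the closest fit. A minimal sketch of what that could look like (the `Instrument`/`InstrumentInfo` classes here are simplified stand-ins for the question's own types):

```csharp
using System.Collections.Concurrent;

// Placeholder types standing in for the question's Instrument/InstrumentInfo.
public class InstrumentInfo { public string Name; }
public class Instrument { public int Id; }

public class InstrumentRegistry
{
    private readonly ConcurrentDictionary<int, InstrumentInfo> infos =
        new ConcurrentDictionary<int, InstrumentInfo>();

    public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
    {
        if (instrument == null || info == null)
            return;
        infos[instrument.Id] = info;   // the indexer is a thread-safe add-or-update
    }

    public InstrumentInfo GetInstrumentInfo(Instrument instrument)
    {
        InstrumentInfo info;
        infos.TryGetValue(instrument.Id, out info);   // thread-safe read; null if absent
        return info;
    }
}
```
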

Joshua
  • Obviously it is using something "for synchronization" inside. What? `lock` or `MemoryBarrier`? I can just use the same. – Oleg Vazhnev Aug 14 '12 at 20:26

You can leverage ConcurrentDictionary in this case. It's thread safe.

Ankush
  • Obviously it is using something "for synchronization" inside. What? `lock` or `MemoryBarrier`? I can just use the same. – Oleg Vazhnev Aug 14 '12 at 20:24
  • @javapowered I don't know about the internals of ConcurrentDictionary, but I do know about ConcurrentQueue. It uses an integer array with a slot corresponding to each element, and uses Interlocked Increment and Decrement. No locks. Memory barrier in the sense that each thread is assigned a slot in that array. You can use Reflector to examine more. – Ankush Aug 14 '12 at 21:19
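
The Interlocked approach alluded to in the comment above could be adapted to the question's array directly. This is a hypothetical sketch, not ConcurrentQueue's actual internals; the types and the `MaxInstruments` constant stand in for the originals:

```csharp
using System.Threading;

// Placeholder types standing in for the question's Instrument/InstrumentInfo.
public class InstrumentInfo { public string Name; }
public class Instrument { public int Id; }

public class InstrumentStore
{
    private const int MaxInstruments = 1024;  // stands in for Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM
    private readonly InstrumentInfo[] instrumentInfos = new InstrumentInfo[MaxInstruments];

    public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
    {
        if (instrument == null || info == null)
            return;
        // Interlocked.Exchange is a full fence: the new reference is
        // visible to all processors before the call returns.
        Interlocked.Exchange(ref instrumentInfos[instrument.Id], info);
    }

    public InstrumentInfo GetInstrumentInfo(Instrument instrument)
    {
        // CompareExchange(ref x, null, null) never replaces the element,
        // but still executes a full fence, so the read cannot be stale.
        return Interlocked.CompareExchange(ref instrumentInfos[instrument.Id], null, null);
    }
}
```
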

Very interesting question. I hope that Eric or Jon come in to set the record straight, but until then let me try my best. Also, let me preface this by saying that I am not 100% sure that this answer is correct.

Normally, when dealing with "do I make this field volatile?" questions, you are worried about situations like this:

SomeClass foo = new SomeClass("foo");

void Thread1Method() {
    foo = new SomeClass("Bar");
    ....
}

void Thread2Method() {
    if (foo.Name == "foo") ...
    ...
    // Here, the old value of foo might be cached 
    // even if Thread1Method has already updated it.
    // Making foo volatile will fix that.
    if (foo.Name == "bar") ...
}

In your case, you are not changing the address of your overall array, but you are changing the value of an element within that array, which is a managed pointer of (presumably) the native word size. These changes are atomic, but I am not sure that they are guaranteed to have a memory barrier (acquire/release semantics, as per the C# spec) and flush the array itself. Therefore, I'd suggest adding a volatile on the array itself.

If the array itself is volatile, then I do not think that it is possible for the reading thread to cache an item out of the array. You don't appear to need to make the elements volatile because you are not modifying the elements in-place in the array- you are replacing them outright with new elements (at least by the looks of things!).

Where you might get into trouble is this:

var myTrumpet = new Instrument { Id = 5 };
var trumpetInfo = GetInstrumentInfo(myTrumpet);
trumpetInfo.Name = "Trumpet";
SetInstrumentInfo(myTrumpet, trumpetInfo);

Here you are updating the item in place, so I wouldn't be terribly surprised to see caching occur. I am not sure if it really will, though. MSIL includes explicit opcodes for reading and writing array elements, so the semantics may be different than standard heap reads and writes.
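
For completeness, the element-level fencing discussed here can be sketched with explicit `Thread.MemoryBarrier()` calls, which is what UPD2 asks about for .NET 4.0. This is a sketch under the assumption that a full fence on both sides is acceptable latency-wise; the types and `MaxInstruments` are placeholders for the question's own:

```csharp
using System.Threading;

// Placeholder types standing in for the question's Instrument/InstrumentInfo.
public class InstrumentInfo { public string Name; }
public class Instrument { public int Id; }

public class InstrumentTable
{
    private const int MaxInstruments = 1024;  // stands in for Constants.MAX_INSTRUMENTS_NUMBER_IN_SYSTEM
    private readonly InstrumentInfo[] instrumentInfos = new InstrumentInfo[MaxInstruments];

    public void SetInstrumentInfo(Instrument instrument, InstrumentInfo info)
    {
        if (instrument == null || info == null)
            return;
        instrumentInfos[instrument.Id] = info;
        Thread.MemoryBarrier();   // the store above cannot be delayed past this point
    }

    public InstrumentInfo GetInstrumentInfo(Instrument instrument)
    {
        Thread.MemoryBarrier();   // the load below cannot be satisfied by an earlier cached read
        return instrumentInfos[instrument.Id];
    }
}
```
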

Chris Shain
  • Adding volatile on the array itself would only make assignments and reads to/of the instrumentInfos volatile. If he only ever assigns that field once (which is what he posted) then that will do nothing to help thread-safety, i.e. does nothing with regard to making element replacement on the array more or less thread-safe. – Peter Ritchie Aug 14 '12 at 19:23
  • i.e. the generated stelem.ref to replace an element in the array will not use a memory address specified volatile with the .volatile opcode. – Peter Ritchie Aug 14 '12 at 19:27
  • No, but the question of *which address will be loaded* is still dependent upon what address is stored in the array, which is affected by the acquire/release semantics of the ***array itself***. We all know that if he is altering InstrumentInfo in-place it isn't thread safe. The question is really "if I store an item at position X in an array, will another thread read that item or the item that *used to be* at position X?" – Chris Shain Aug 14 '12 at 19:30
  • Furthermore, he only assigns the field `instrumentInfos` once but he *clearly* mutates it (in the `SetInstrumentInfo` method). Therefore, it is the mutations to `instrumentInfos` that are in question here, not the mutations to the elements. – Chris Shain Aug 14 '12 at 19:34
  • `volatile` would not affect the array itself, it would only affect the variable. i.e. decorating the variable with `volatile` does not give array element accesses any acquire/release semantics. – Peter Ritchie Aug 14 '12 at 20:00
  • He does not mutate the field `instrumentInfos` in `SetInstrumentInfo`, he mutates *the memory that `instrumentInfos` references*. That's a very different and important distinction. – Peter Ritchie Aug 14 '12 at 20:02
  • The InstrumentInfo class is immutable by design; that's why I have the setter/getter. And I also want to avoid locks if possible, which is why InstrumentInfo is immutable. – Oleg Vazhnev Aug 14 '12 at 20:23
  • @PeterRitchie maybe I am missing something blindingly obvious here but if I mark the `instrumentInfos` field as volatile then every access of that field is going to read in the data stored in the array from main memory, and every write is going to flush to main memory. – Chris Shain Aug 14 '12 at 21:09
  • The fact that the address of the array itself isn't changing is absolutely besides the point- `volatile` is going to ensure that modifications to the *content* (that is the list of references) of the array are visible on any thread. Which, given that we now know that `InstrumentInfo` is a class (and an immutable one at that), means that volatile is sufficient. – Chris Shain Aug 14 '12 at 21:09
  • @ChrisShain right, every access to the *field*. The only accesses in `SetInstrumentInfo` are to what that field points to. i.e. they don't write a new value to the field. It's similar to saying that if I had `volatile MyClass field = new MyClass()`, then `field.MyIntProperty = 10;` would be a volatile operation. It's not. – Peter Ritchie Aug 14 '12 at 21:25