how can I ensure "nice" (or at least "ok") cache behavior with a volatile mutex'd array?

Question

I have two threads which share a large array of data. One thread writes to it, and the other reads from it. Because the array cannot be in an "incompletely-updated" state when read, I mutex all array operations (reads/writes).

I also try to "play nicely with the cache": when I read/write large amounts of data, I take the mutex once, read/write as much as required in sequence, then release the mutex.

[edit: to clarify, this is the cache behavior I would like to preserve. If you read/write large swathes of memory in sequence, then the cache pulls each line of data in from memory (slow!) only once, and you can then operate on the cache (fast!) without hitting memory again until that cache line is exhausted.]

One thing I would like to protect against is "writing to a small part of the array in one thread, and then in another thread (after receiving the mutex) reading from that small part of the array which hasn't yet been flushed to memory (out of the first thread/core's cache), resulting in an outdated read". So the solution would be to mark the array as "volatile" (right?).

Am I correct to worry that marking the array as "volatile" will totally kill my ability to read/write large chunks in a cache-friendly way? That is, will every read/write go all the way to/from memory?

In a perfect world, what I think I'd want is the ability to:

1. grab a mutex
2. load data from memory (as though it were volatile)
3. read/write to the array (as though it weren't volatile; it should be safe to rely on my own cache because of the mutex)
4. (in the case of a write) flush any remaining cache to memory
5. relinquish the mutex

Can I accomplish this? Are there any glaring misunderstandings here on my part?
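
For reference, the lock / bulk access / unlock pattern described above can be sketched with a plain (non-volatile) array. This is only a minimal sketch assuming POSIX threads; the names `shared`, `shared_lock`, `write_block`, `read_block`, and the size `ARRAY_LEN` are hypothetical and not taken from any real code. It relies on the POSIX guarantee that mutex lock and unlock synchronize memory between threads, so no explicit "load" or "flush" step appears.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN 4096 /* hypothetical size, chosen only for illustration */

/* Plain, non-volatile shared array; the mutex alone guards it. */
static int shared[ARRAY_LEN];
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: copy n values into the array inside one critical section. */
void write_block(size_t offset, const int *src, size_t n)
{
    assert(offset <= ARRAY_LEN && n <= ARRAY_LEN - offset);
    pthread_mutex_lock(&shared_lock);
    /* Sequential bulk copy; the compiler and CPU are free to optimize it. */
    memcpy(&shared[offset], src, n * sizeof shared[0]);
    /* Unlocking makes the writes above visible to the next thread that locks. */
    pthread_mutex_unlock(&shared_lock);
}

/* Reader side: copy n values out of the array inside one critical section. */
void read_block(size_t offset, int *dst, size_t n)
{
    assert(offset <= ARRAY_LEN && n <= ARRAY_LEN - offset);
    /* Locking synchronizes with the previous unlock of the same mutex. */
    pthread_mutex_lock(&shared_lock);
    memcpy(dst, &shared[offset], n * sizeof shared[0]);
    pthread_mutex_unlock(&shared_lock);
}
```

In this sketch the writer thread would call `write_block` and the reader thread `read_block`; each call takes the mutex once and touches memory sequentially, which is the access pattern the question wants to preserve.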

Do you have a reason to think that this is an issue? This sounds a lot like a case of premature optimization. Also, a variable being volatile has nothing to do with cache, it's about how the compiler chooses when to reuse a value read that should not have changed from the currently running code's perspective. — Thomas Jager, Feb 07 '20 at 17:06
re volatile: ah ok that makes sense. so then the issue of "an outdated variable existing in another thread's cache" just... isn't an issue (assume the variable is mutex'd)? if so- then great! (you're right this question will have been a case of premature optimization [maybe more like "premature safety concern"]- but at least I learned something!) thanks! — Phildo, Feb 07 '20 at 17:09
I'm not particularly knowledgeable about this, but this answer agrees with my assumptions: https://stackoverflow.com/a/30968557/5567382 — Thomas Jager, Feb 07 '20 at 17:12
Data in an array, once written, is _in memory_. What do you mean by: _which hasn't yet been flushed to memory_ — ryyker, Feb 07 '20 at 17:24
@ryyker I had meant "in a CPU cache". Or even "in a CPU register" (<- though that one is only a problem if one thread is still in some small section of a stack frame, which is prevented by the mutex). Am I wrong in this understanding? — Phildo, Feb 07 '20 at 17:42
@Phildo You're overthinking it. The mutex should be sufficient. No need for volatile in this case. — Ian Abbott, Feb 07 '20 at 17:50
@ThomasJager ok actually that linked answer agrees with _my_ assumptions: I'm not worried about context switching (the assumed case is two independent cores). The answerer goes on to say that "[volatile] tells the compiler [the memory] must be accessed each time the source code dictates it", which would kill the cache with individual accesses (as I feared). — Phildo, Feb 07 '20 at 17:50
@IanAbbott that very well may be, but I don't understand how/why. is there a resource you could link to (or a short explanation) showing how CPU caches retain consistency across cores without constantly dumping themselves? — Phildo, Feb 07 '20 at 17:52
@Phildo It's called "cache coherency", but that's only half the story. As you say, you need to worry about data already being in registers due to out of order execution etc. But the mutex lock semantics include the use of memory barriers at both the inter-CPU level (instruction "lock" prefixes) and the compiler level (enforcing C sequence point semantics around the mutex). — Ian Abbott, Feb 07 '20 at 17:59
@Phildo There is a discussion of multi-threaded execution and data races in section [5.1.2.4](http://port70.net/~nsz/c/c11/n1570.html#5.1.2.4) of the C11 spec. It relates specifically to C11 standard threads, but the concepts are the same for GCC pthreads or Windows threads. — Ian Abbott, Feb 07 '20 at 18:10
@Phildo Also worth a read, although it's aimed at Linux kernel developers: [Why the “volatile” type class should not be used](https://www.kernel.org/doc/html/latest/process/volatile-considered-harmful.html#why-the-volatile-type-class-should-not-be-used). — Ian Abbott, Feb 07 '20 at 18:21
The real value of the `volatile` modifier is when your process has no idea when a variable will be written to, i.e. the variable has to be created in such a way that it can accept updates from something outside the process at any time. It is therefore incumbent on your process to _read_ the value of that variable often. Neither `volatile` nor `multiple threads` is a necessary component to do what you have described. (And FWIW, contrary to the previous comment, there are places where `volatile` variables are invaluable when used with hardware monitoring.) — ryyker, Feb 07 '20 at 18:55
As an aside, simple arrays of variables can be used to avoid having to use mutex, or any other kind of thread safe feature to keep reading and writing values from different threads safe. [Read more here](https://stackoverflow.com/questions/31548355/threading-and-thread-safety-in-c) — ryyker, Feb 07 '20 at 19:04
@ryyker maybe I'm confused but the thrust of that whole question seems wrong: 1. you _cannot_ just have some simple non-mutex-protected "token" variable as a substitute for mutexes if the contents of shared memory in any way change in real time. also, that person's question involves accessing _different parts of memory_ (though in the "same" array...) for each thread, in which case they don't need _any_ token? I can't tell if there's something fundamental I'm missing, or if we're talking about two different problems, or... ? — Phildo, Feb 07 '20 at 20:30
@Phildo - I get the same sense. Several comments seem to be talking past each other in an attempt to answer a very amorphous question. In a nutshell though, from what ***I*** gather, 1) `volatile` has no place in the conversation, i.e. the compiler has no need to treat the array as a variable whose value can change at any time without any action being taken by the code the compiler finds nearby. 2) If the threads already write to different locations, there is no need for tokens, thread-safe variables, critical sections, mutexes etc. _We just need a good question!_... — ryyker, Feb 07 '20 at 21:35
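
To make the point about `volatile` from the comments concrete, here is a small sketch in C. The names `plain_data`, `volatile_data`, `fill_both`, `sum_plain`, `sum_volatile`, and the length `N` are hypothetical. The assumption, which holds on typical cache-coherent hardware, is that `volatile` only restricts what the compiler may do with the accesses (no caching values in registers, no merging or dropping loads); the loads themselves remain ordinary loads that the CPU cache can serve.

```c
#include <stddef.h>

#define N 1024 /* hypothetical length, chosen only for illustration */

static int plain_data[N];
static volatile int volatile_data[N];

/* Store the same value into every element of both arrays. */
void fill_both(int value)
{
    for (size_t i = 0; i < N; i++) {
        plain_data[i] = value;
        volatile_data[i] = value;
    }
}

/* The compiler may unroll, vectorize, or otherwise reorganize these loads. */
long sum_plain(void)
{
    long total = 0;
    for (size_t i = 0; i < N; i++)
        total += plain_data[i];
    return total;
}

/* The compiler must emit one load per element, in source order.
   The result is the same; only the compiler's freedom to optimize differs. */
long sum_volatile(void)
{
    long total = 0;
    for (size_t i = 0; i < N; i++)
        total += volatile_data[i];
    return total;
}
```

Both functions return the same sum. Any difference would show up in the generated machine code and possibly in speed, not in whether the CPU cache is used, and neither version provides synchronization between threads by itself.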

0 Answers