I have a variable int foo that is accessed from two threads. Assume I have no race-condition issues (access is protected by a mutex, all operations are atomic, or some other method guards against races). There is still the issue of "register caching" (for lack of a better name): the compiler may assume that if the variable is read twice without being written in between, it holds the same value, and so may "optimize" code like this:

while(foo) { // <-may be optimized to if(foo) while(1)
  do-something-that-doesn't-involve-foo;
}

or

if(foo) // becomes something like (my assembly is very rusty): mov ebx, [foo]; cmp ebx, 0; jz label;
  do-something-that-doesn't-involve-foo;
do-something-else-that-doesn't-involve-foo;
if(foo) // <-may be optimized to jz label2;
  do-something;

Does marking foo as volatile solve this issue? Are changes from one thread guaranteed to reach the other thread?

If not, what other way is there to do this? I need a solution for Linux and Windows (possibly separate solutions) that does not rely on C++11.

Baruch
  • I remember seeing a quote from the standard saying you're not supposed to use volatile for threads anymore in C++. I'll try to find it – aaronman Aug 08 '13 at 20:07
  • http://stackoverflow.com/questions/4557979/when-to-use-volatile-with-multi-threading – Scotty Bauer Aug 08 '13 at 20:08
  • The problem here is "Assuming I have no race-condition issues". That's an **ENORMOUS** assumption. – Dietrich Epp Aug 08 '13 at 20:10
  • Doesn't `std::atomic` do what you want? – Ben Voigt Aug 08 '13 at 20:11
  • I have always used volatile to ensure no "register caching" is done, and it has always worked. – rbelli Aug 08 '13 at 20:12
  • @Dietrich - No race condition can be enforced through the locking ptr class - check this out: http://www.drdobbs.com/cpp/volatile-the-multithreaded-programmers-b/184403766 – MasterPlanMan Aug 08 '13 at 20:18
  • @DietrichEpp No race conditions can be applied with atomic operations on the variable (a read or write to an `int` should be atomic, or using atomic processor/OS instructions) – Baruch Aug 08 '13 at 20:22
  • @rbelli: You've just been lucky. There's no guarantee. – GManNickG Aug 08 '13 at 21:18
  • The only way you could possibly tell if the compiler is caching the value of `foo` in a register is if you have "race condition issues." – Casey Aug 08 '13 at 21:18
  • @baruch: Whether or not you think `int` *should* behave atomically doesn't mean it actually will. (Note, it doesn't!) Just use `atomic` and be done with it. There's a reason it was added to the language; if existing facilities were enough, it wouldn't be there. – GManNickG Aug 08 '13 at 21:19
  • @GManNickG I am using a compiler without C++11 support (VS2010). Can you explain/link to why `int` reads aren't atomic? – Baruch Aug 10 '13 at 19:02
  • @baruch: Check out Boost.Atomic then, or Intel's TBB, or the implementations of each in the worst case. As for why: atomicity is expensive in general, so the C++ language has no reason to enforce it on every single `int`. Maybe on *your* platform it's free (and an `atomic` would take advantage of this), but C++ has to work on many platforms, so the language doesn't make any guarantees. Instead it provides an opt-in tool. – GManNickG Aug 10 '13 at 19:13
  • @GManNickG I only care about x86/x64 with this project – Baruch Aug 10 '13 at 19:16
  • @baruch: That's fine, just make sure your project has safeguards in case it ever gets ported. Note, though, that making arguments on the basis of a particular platform usually leads to difficult code. You're going to wind up implementing some form of `atomic` so you can get the memory fences you need, and at that point you might as well just import some already-tested and portable library. – GManNickG Aug 10 '13 at 19:19
  • @GManNickG: The volatile keyword ensures that every write to a variable goes to memory rather than being held in a register for a later write (an optimization), and likewise for reads. As I said, I use volatile to avoid register caching, and for that it works. I don't use it for synchronization, because it doesn't make the write or read atomic. (http://stackoverflow.com/questions/4437527/why-do-we-use-volatile-keyword-in-c) – rbelli Aug 11 '13 at 23:09

3 Answers


What you need are memory barriers. On Windows:

MemoryBarrier();

or, with GCC on Linux:

__sync_synchronize();
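
As a rough illustration of where such a barrier would go, here is a minimal sketch, assuming GCC or Clang on Linux with pthreads. The names payload, ready, producer, and run_handoff are made up for this example. It relies on the compiler treating __sync_synchronize() as a full barrier for both the CPU and the optimizer, which is compiler-specific behavior and not something the pre-C++11 standard promises. On Windows, MemoryBarrier() would take the place of the builtin.

```cpp
#include <pthread.h>

// Hypothetical shared state: 'payload' is handed over via the 'ready' flag.
// Neither variable is volatile.
static int payload = 0;
static int ready = 0;

static void* producer(void*) {
    payload = 42;            // 1. write the data
    __sync_synchronize();    // 2. full barrier: the data is ordered before the flag
    ready = 1;               // 3. publish the flag
    return 0;
}

// Starts the producer, spins until the flag is set, and returns the payload
// it observed (or -1 if the thread could not be created).
int run_handoff() {
    payload = 0;
    ready = 0;
    pthread_t t;
    if (pthread_create(&t, 0, producer, 0) != 0) return -1;

    while (!ready) {
        __sync_synchronize();  // the barrier makes the compiler reload 'ready'
    }
    __sync_synchronize();      // order the flag read before the payload read
    int seen = payload;

    pthread_join(t, 0);
    return seen;
}
```

Calling run_handoff() should hand back the value written by the producer thread. Without the barrier inside the loop, the compiler would be free to read ready once and spin forever, which is exactly the "register caching" the question describes.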

Edit: I've bolded the interesting part and here's the link to the wiki article (http://en.wikipedia.org/wiki/Memory_barrier#cite_note-1) and the associated reference (http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.07.23a.pdf)

Here's the answer to your other question (from Wikipedia):

In C and C++, the volatile keyword was intended to allow C and C++ programs to directly access memory-mapped I/O. Memory-mapped I/O generally requires that the reads and writes specified in source code happen in the exact order specified with no omissions. Omissions or reorderings of reads and writes by the compiler would break the communication between the program and the device accessed by memory-mapped I/O. A C or C++ compiler may not reorder reads and writes to volatile memory locations, nor may it omit a read or write to a volatile memory location. **The keyword volatile does not guarantee a memory barrier to enforce cache-consistency. Therefore the use of "volatile" alone is not sufficient to use a variable for inter-thread communication on all systems and processors.**[1]

Check these out; they provide great explanations of the subject:
http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2
http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2

MasterPlanMan

If access is protected by a mutex, you do not have any issue to worry about. The volatile keyword is useless here. A mutex is a full memory barrier and thus no object whose address could be externally visible can be cached across the mutex lock or unlock calls.
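
To make that concrete, here is a minimal sketch, assuming POSIX threads on Linux (on Windows a CRITICAL_SECTION would play the same role). The helper names read_foo, write_foo, clear_foo, and wait_until_cleared are invented for illustration. Note that foo is a plain int with no volatile; every access simply goes through the lock.

```cpp
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t foo_lock = PTHREAD_MUTEX_INITIALIZER;
static int foo = 1;  // plain int, deliberately not volatile

// Reads foo while holding the mutex.
static int read_foo() {
    pthread_mutex_lock(&foo_lock);
    int v = foo;
    pthread_mutex_unlock(&foo_lock);
    return v;
}

// Writes foo while holding the mutex.
static void write_foo(int v) {
    pthread_mutex_lock(&foo_lock);
    foo = v;
    pthread_mutex_unlock(&foo_lock);
}

// Thread body: clears the flag.
static void* clear_foo(void*) {
    write_foo(0);
    return 0;
}

// Mirrors the question's while(foo) loop. Returns the final value of foo
// (0 on success), or -1 if the thread could not be created.
int wait_until_cleared() {
    write_foo(1);
    pthread_t t;
    if (pthread_create(&t, 0, clear_foo, 0) != 0) return -1;

    while (read_foo()) {   // foo is re-read under the lock on every pass
        sched_yield();     // let the other thread run
    }

    pthread_join(t, 0);
    return read_foo();
}
```

Because the lock and unlock calls are opaque to the optimizer and act as barriers, the loop cannot be collapsed into if(foo) while(1), so wait_until_cleared() terminates once the other thread has stored 0.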

R.. GitHub STOP HELPING ICE

The volatile keyword was originally introduced to indicate that a value can be changed by the hardware. This happens when a hardware device has memory-mapped registers or memory buffers. The right approach is to use it only for this purpose.

No modern synchronization language construct or synchronization library relies on this keyword. Application-level programmers should follow suit.
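
As one example of synchronizing without volatile, here is a minimal sketch, assuming GCC or Clang (for the __sync_fetch_and_add builtin) and pthreads. The names counter, bump, and count_with_two_threads are hypothetical. On Windows, the Interlocked family of functions would be the closest counterpart.

```cpp
#include <pthread.h>

static long counter = 0;  // plain long, no volatile

// Thread body: increments the shared counter 100000 times, atomically.
static void* bump(void*) {
    for (int i = 0; i < 100000; ++i) {
        __sync_fetch_and_add(&counter, 1);  // atomic add with a full barrier
    }
    return 0;
}

// Runs two incrementing threads and returns the final count,
// or -1 if a thread could not be created.
long count_with_two_threads() {
    counter = 0;
    pthread_t a, b;
    if (pthread_create(&a, 0, bump, 0) != 0) return -1;
    if (pthread_create(&b, 0, bump, 0) != 0) {
        pthread_join(a, 0);
        return -1;
    }
    pthread_join(a, 0);
    pthread_join(b, 0);
    return counter;  // safe to read: both threads have been joined
}
```

Replacing the builtin with a plain ++counter, volatile or not, would lose updates, which is why volatile is not a substitute for an atomic operation.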

Kirill Kobelev