
Say I have a class, whatever its nature might be. I want to share an object of this class between threads. In the past, I would have thought that a mutex - while perhaps not the most efficient way - would be enough to make sure that everything works.

Now, I have read a bit about std::atomic and that it is necessary even for a simple bool flag: Do I have to use atomic&lt;bool&gt; for "exit" bool variable?

While I understand why a bool should be atomic, I do not understand how a simple mutex prevents the following issue:

> Second, when two threads run on different cores, they have separate caches; writing a value stores it in the cache, but doesn't update other caches, so a thread might not see a value written by another thread.

Is a mutex not merely a mechanism that makes sure that no other thread is able to lock the mutex at the same time? Within the locked region I might play around with a whole bunch of variables, and the compiler might not know which variables are involved.

As a consequence, simply putting a mutex-based lock around all regions that touch shared resources does not seem sufficient to me at the moment. Could it not still be that the two threads see different versions of this resource, because the per-core caches are simply not updated?

IceFire
  • Does [this](https://stackoverflow.com/questions/6837699/are-mutex-lock-functions-sufficient-without-volatile) answer your questions? – default Jun 30 '17 at 11:40
  • 1
    @Default not quite. It says that before C++0x there is no guarantee but it does not give details about how C++0x helps. manni66's answer is more insightful – IceFire Jun 30 '17 at 11:43
  • There is also the memory fence that is used as a kind of sequence point by the CPUs to synchronize memory. So you have a mechanism to synchronize the memory and caches of all cores and threads along with a synchronization primitive, the mutex. See https://stackoverflow.com/questions/7280119/ for an explanation. – Richard Chambers Jun 30 '17 at 12:01

1 Answer


The C++ memory model guarantees that changes made to an object in one thread are visible to other threads if the accesses are protected by a mutex. See http://en.cppreference.com/w/cpp/language/memory_model for details.

  • So, someone keeps track of what variables are within mutex area? I mean, this has to be at runtime, I guess, so does it not cost a lot of time? – IceFire Jun 30 '17 at 11:29
  • @IceFire yes, it is expensive. It is best to avoid sharing data. –  Jun 30 '17 at 11:33
  • @IceFire - it doesn't really keep track; what it does is cause a system-wide "flush" of the caches (CPU memory caches etc.) - i.e. whatever has changed will now get seen, because everything is now dirty and needs to be refreshed. – Richard Critten Jun 30 '17 at 11:43
  • @RichardCritten This seems even more bothersome. When I use a mutex, the entire cache gets dirty? This would literally disable the cache. And if it is limited to the variables within the mutex area, some tracking still seems to be necessary, how else would the runtime know which variables to un-cache? – IceFire Jun 30 '17 at 11:46
  • @IceFire exactly, that is why locks are said to be expensive. Imagine a computer with 2 physical CPUs, level-1 cache on the chip (per CPU), and level-2 and level-3 on the memory controller. When exiting a lock, all these caches need to be synced, or else threads on different CPUs would see the old data. – Richard Critten Jun 30 '17 at 11:50
  • @RichardCritten But this would make locks disastrous and any parallel algorithm with mutexes useless. It does not seem *that* bad in my tests. Do you have any source for me to read the details? – IceFire Jun 30 '17 at 11:53
  • 1
    @manni66 I have read the memory_model article but it does not say anything about cache copies. It just says that a mutex clearly sets a memory order, so that there is no data race. But the reasons for the data race are not stated or explained and also not how a mutex addresses them. Do you have a more detailed source or can you add a more detailed explanation? – IceFire Jun 30 '17 at 11:53
  • @IceFire start with the keyword `MFENCE` for Intel ( http://x86.renejeschke.de/html/file_module_x86_id_170.html ) it's not always necessary but searching on it will produce a lot of articles. – Richard Critten Jun 30 '17 at 12:01
  • @IceFire if you really want to dig deep you have to follow the links in the article, read the standard and consult google. Maybe https://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Herb-Sutter-Concurrency-and-Parallelism will help. –  Jun 30 '17 at 12:03
  • @RichardCritten, IIRC, the system-wide flush of the caches is really a flush of the _cache-lines_ containing the memory address, not the entirety of L1 L2 etc... But that still impacts every thread currently using those cache-lines. – Chris O Jun 30 '17 at 13:57