
I ran into this problem when I tried to solve a concurrency issue in my code. In the original code, we only use a unique lock for the write operations on a cache, which is an STL map. There are no restrictions on read access to the cache. So I was thinking of adding a shared lock for the reads and keeping the unique lock for the writes. But someone told me that it is not safe to access a map from multiple threads because of some internal caching the map itself does.

Can someone explain the reason in detail? What does this internal caching do?
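
Roughly, this is the scheme I have in mind (just a sketch; I am writing it with `std::shared_mutex`, which only arrived in a later standard — at the time the equivalent would be `boost::shared_mutex` — and the key/value types and names here are placeholders):

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Placeholder types and names for illustration only.
std::map<std::string, std::string> cache;
std::shared_mutex cacheMutex;

// Readers: several threads may hold the shared lock at the same time.
bool lookup(const std::string& key, std::string& value)
{
    std::shared_lock<std::shared_mutex> lock(cacheMutex);
    std::map<std::string, std::string>::const_iterator it = cache.find(key);
    if (it == cache.end())
        return false;
    value = it->second;
    return true;
}

// Writer: the unique lock excludes other writers and all readers.
void store(const std::string& key, const std::string& value)
{
    std::unique_lock<std::shared_mutex> lock(cacheMutex);
    cache[key] = value;
}
```

Is a scheme like this safe, or does the map's supposed internal caching break it even for the readers?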

Weitao Wang
  • That doesn't seem to be a concern here: http://stackoverflow.com/questions/8464458/c-pthread-how-to-make-map-access-threadsafe – Kirk Backus Aug 22 '13 at 15:16
  • The **second** answer in this link provides some interesting algorithmic information: http://stackoverflow.com/questions/7078785/design-problem-thread-safety-of-stdmap – Kirk Backus Aug 22 '13 at 15:19
  • @KirkBackus: Re the first link: concurrent reads are safe as long as there are no writes going on. That is, having a writer's lock that readers ignore will let the writer modify the map while a reader is accessing it (and vice versa) – David Rodríguez - dribeas Aug 22 '13 at 15:42
  • `map` is not thread safe by default because it is a costly thing to ensure, and not all use cases require it to be thread safe. If you need it to be safe, then you need to protect read and write access to it (as per @James Kanze's answer) – josh poley Aug 22 '13 at 15:45

4 Answers


The implementations of std::map must all meet the usual guarantees: if all you do is read, then there is no need for external synchronization, but as soon as one thread modifies the map, all accesses must be synchronized.

It's not clear to me what you mean by "shared lock"; there is no such thing in the standard. But if any one thread is writing, you must ensure that no other threads may read at the same time. (Something like POSIX's pthread_rwlock could be used, but there's nothing similar in the standard, at least not that I can find offhand.)
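
As a rough sketch of that approach (my own illustration; the names are placeholders and error handling and RAII wrapping are omitted):

```cpp
#include <pthread.h>

#include <map>
#include <string>

std::map<std::string, std::string> cache;                 // hypothetical shared cache
pthread_rwlock_t cacheLock = PTHREAD_RWLOCK_INITIALIZER;  // POSIX read/write lock

// Any number of readers may hold the read lock concurrently.
bool lookup(const std::string& key, std::string& value)
{
    pthread_rwlock_rdlock(&cacheLock);
    std::map<std::string, std::string>::const_iterator it = cache.find(key);
    bool found = (it != cache.end());
    if (found)
        value = it->second;
    pthread_rwlock_unlock(&cacheLock);
    return found;
}

// The write lock excludes readers and other writers alike.
void store(const std::string& key, const std::string& value)
{
    pthread_rwlock_wrlock(&cacheLock);
    cache[key] = value;
    pthread_rwlock_unlock(&cacheLock);
}
```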

James Kanze
  • There should be a `shared_mutex` (read/write mutex) in C++14. In the meantime, there's always [Boost](http://www.boost.org/doc/libs/release/doc/html/thread/synchronization.html#thread.synchronization.mutex_types.shared_mutex) – Mike Seymour Aug 22 '13 at 16:43
  • Thanks for correcting me. @MikeSeymour is right, it's shared_mutex from Boost. Probably I didn't state my question clearly. What I don't understand is NOT how to make it thread safe, but why it's not safe for multithreaded reads. So to make the question simple, consider the following scenario: there are no write operations; is it safe for several threads to read the map simultaneously? – Weitao Wang Aug 22 '13 at 20:03
  • @MikeSeymour If it's shared, it's not a mutex. Is it a read-write lock? – James Kanze Aug 22 '13 at 22:14
  • @WeitaoWang If no thread is modifying, then _you_ don't need to protect the accesses. – James Kanze Aug 22 '13 at 22:21
  • @JamesKanze: Yes, it's what some call a read-write lock. It seems to me like a sensible enough name for something that can be either shared or mut(ually) ex(clusive); certainly, "lock" would be worse (in the context of C++ threading library) since that word already has an established meaning. But there's no point arguing with me about it, since I have nothing to do with the C++ committee. – Mike Seymour Aug 23 '13 at 10:48
  • @MikeSeymour The "ex" in mutex is for "exclusion"; if it's shared, it's not exclusive. (But of course, it's just a name. And "mutsh" (for mutual shared) would be rather horrible.) – James Kanze Aug 23 '13 at 15:03
  • @JamesKanze: Of course it's exclusive: shared owners exclude unique owners, and unique owners exclude everyone else. Unless you insist that "exclusive" can only mean "everyone excludes everyone else"; but most of the English-speaking world would disagree. But there's no point arguing with me about it, since I have nothing to do with the C++ committee. – Mike Seymour Aug 23 '13 at 15:18

All const member functions of std types can be safely called from multiple threads in C++11 without explicit synchronization. In fact, any type that is ever used in conjunction with the standard library (e.g. as a template parameter to a container) must fulfill this guarantee.

Clarification: The standard guarantees that your program will have the desired behaviour as long as you never cause a write and any other access to the same data location without a synchronization point in between. The rationale behind this is that modern CPUs don't have strict sequentially consistent memory models, which would limit scalability and performance. Under the hood, your compiler and standard library will emit appropriate memory fences at places where stronger memory orderings are needed.
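
A minimal illustration of the read-only case (my own example, not from the standard text): the map is filled first, and then several threads read it concurrently with no locking at all:

```cpp
#include <iostream>
#include <map>
#include <string>
#include <thread>
#include <vector>

int main()
{
    std::map<std::string, int> table;
    table["a"] = 1;   // all modifications happen before any reader starts
    table["b"] = 2;

    std::vector<std::thread> readers;
    for (int i = 0; i != 4; ++i) {
        readers.push_back(std::thread([&table]() {
            // Only const operations on the map: no synchronization needed,
            // because no thread writes to the map while the readers run.
            auto it = table.find("a");
            if (it != table.end())
                std::cout << it->second << '\n';
        }));
    }
    for (auto& t : readers)
        t.join();
}
```

This relies entirely on the fact that no thread modifies the map once the readers have started; as soon as one thread does, every access needs synchronization.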

mionic
  • That is false. If any thread modifies the container, _all_ accesses (including those through `const`) must be protected. (This is, of course, the normal thread safety guarantee: all of the containers in the standard library are thread safe in the sense that they define a contract, and if you meet that contract, there will be no race conditions. And if you don't it is undefined behavior.) – James Kanze Aug 22 '13 at 15:27
  • Of course, you may not cause a data race by calling a non-const member function without synchronization to prevent other simultaneous accesses. That would cause undefined behaviour. – mionic Aug 22 '13 at 15:30
  • But you need to synchronize the const accesses as well, if any thread is using a non-const function. – James Kanze Aug 22 '13 at 15:33
  • Yes, how would you reword my answer to make that more obvious? – mionic Aug 22 '13 at 15:35
  • By just describing the exact contract. (When it comes down to it: thread safety doesn't really mean much more than having a contract specifying what you need to do in a multithreaded environment.) – James Kanze Aug 22 '13 at 15:37

Since C++11 at least, a const operation on a standard library class is guaranteed to be thread safe (assuming const operations on objects stored in it are thread safe).
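
As a concrete illustration of which operations that covers (my own example; `examples` is just a placeholder function) — note in particular that `operator[]` is not const, because it may insert:

```cpp
#include <map>
#include <string>

void examples(const std::map<std::string, int>& read_only,
              std::map<std::string, int>& writable)
{
    // const operations: safe to call from many threads at once,
    // provided no thread is modifying the map at the same time.
    read_only.find("key");
    read_only.count("key");
    read_only.size();

    // non-const operations: each of these requires exclusive access.
    writable.insert({"key", 1});
    writable.erase("key");
    writable["key"] = 2;   // operator[] may insert, so it counts as a write
}
```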

Angew is no longer proud of SO
  • And by "is thread safe": "so long as *all* threads are using `const`, you are fine. If any thread is using non-`const`, you are not fine, and you must synchronize." – Yakk - Adam Nevraumont Aug 22 '13 at 15:21
  • All operations are guaranteed to be thread safe, in the sense that the standard defines a contract with regards to threading, and if you meet that contract, there should be no problems. That contract says that if _any_ thread modifies the object (container), then _all_ accesses in all threads must be protected. – James Kanze Aug 22 '13 at 15:25
  • @Yakk I would say that if any thread is using a non-const, you're still fine, as long as all of the accesses are synchronized in some way. (This is how the standard defines thread safety. It's also really the only sensible way to define it.) – James Kanze Aug 22 '13 at 15:29
  • @JamesKanze I was just wanting to be clear: one meaning of "thread safe" is that "the `const` methods are automatically synchronized so that you never have to worry about any multi-threaded access. The non-`const` methods must be manually synchronized so that no more than one is being called at once", which is false, yet a consistent and reasonable interpretation of your post. – Yakk - Adam Nevraumont Aug 22 '13 at 15:59
  • @Yakk That's not a usable meaning of thread safe. The only usable meaning is that the object defines a contract with regards to threading. Your code has to conform to that contract. And as I said, the contract for the standard containers is clear: if _any_ thread calls a non-const function, _all_ accesses have to be synchronized. – James Kanze Aug 22 '13 at 22:13

I really don't see why there would be any caching issue...

If I refer to the STL definition of a map, it should be implemented as a binary search tree.

A binary search tree is simply a tree of key-value nodes. Those nodes are sorted according to the natural order of their keys and, to avoid any ambiguity, keys must be unique. So no internal caching is needed at all.

As no internal caching is required, read operations are safe in multi-threading context. But it's not the same story for write operations, for those you must provide your own synchronization mechanism as for any non-threading-aware data structure.

Just be aware that you must also forbid any read operations while a write operation is being performed by a thread, because that write operation can trigger a slow, complete rebalancing of the binary tree; a quick read operation in the middle of a long write operation could see an inconsistent tree and return a wrong result.

OOEngineer
  • The standard doesn't forbid caching, and an implementation could conceivably cache the last element accessed, to speed up multiple accesses to the same element. The interface of `std::map` is designed so that you rarely need multiple accesses to the same element, so this is probably not a good idea, but it's legal. If the implementation does this, however, it must still ensure that there are no race conditions. – James Kanze Aug 22 '13 at 15:36
  • I see, so there can still be short-term internal caching, but with no impact in multi-threading context. However, it's probably best to forbid any read operation when a thread is performing a write operation on the map as this can result in a complete and slow rebalancing of the whole binary tree. – OOEngineer Aug 22 '13 at 15:39
  • Basically, the standard doesn't say anything about the implementation, so there can be any imaginable caching. What is required is that if no thread calls a non-const function, then no external synchronization is required; _if_ there is any internal change of state, then the implementation must arrange for the synchronization internally, so that the client code is unaware of it (except for the impact on performance). – James Kanze Aug 22 '13 at 15:42