1

I am looking to implement a map of simple struct with 2 writers and multiple readers. To make it thread-safe, I am currently using mutex with unique locks (my code below). There will only be insert and edit operations on the structs and no erase.

However, because write operations will be much more frequent than read operations (around ~1m write operations vs 15-20,000 read operations in the same time period), I am worried that use of unique locks will affect the performance of the read operations.

I am wondering if there is a better / simpler way to implement a simple concurrent map for my problem? For example, would it be better to lock just the struct itself rather than the entire map each time a write / read operation is performed?

I have looked into Intel TBB concurrent unordered map and Pershing's Junction map but would prefer to stick with just using the STL if it is possible.

I am using C++11 so shared_lock is not available. I contemplated using boost::shared_lock for the read operations, but I understand from this post that the cost of locking a shared_mutex is higher than that of a plain one.

EDIT: edited code post discussion in comments.

#include <mutex>

struct Foo
{
    double bar1 = 0;
    long bar2 = 0;
};

typedef std::shared_ptr<Foo> FooPtr;

class FooMap
{
    std::mutex mMutex;
    std::unordered_map<long, FooPtr> mMap;

public:
    FooPtr get(long const& key) 
    {    
        return mMap[key];
    }

    void set(long const& key, double a, long b) 
    {
        std::unique_lock<std::mutex> lock(mMutex);
        mMap[key]->bar1 = a;
        mMap[key]->bar2 = b;
    }

};

L.Koh
  • 109
  • 1
  • 8
  • This really depends what the Foo payload is. Ideally, make it a shared_ptr, that way, the content from get will stay live while the caller keeps the returned shared pointer, even if the setter tries to stomp it. Your setter should do a single look-up under the mutex, then build the new content and then stomp the payload shared_ptr to the new content. This gives you some performance gains and minimises the lock duration. – Gem Taylor Aug 21 '18 at 11:50
  • @GemTaylor Copying `Foo` may be cheaper than copying `shared_ptr`, because the latter involves an atomic increment. I would generally advise against using `shared_ptr` in performance sensitive code. – Maxim Egorushkin Aug 21 '18 at 13:27
  • I don't disagree totally, but we see already that foo has 2 members which are already being set expensively. – Gem Taylor Aug 21 '18 at 13:41
  • Maybe you can try this: each writer uses its own thread-local copy of the map, so there's no contention between the writers. When a reader needs to read (rarely) both writers are stopped and the two copies are merged. Of course it won't work in all circumstances. – n. m. could be an AI Aug 21 '18 at 15:19
  • Thanks all. @GemTaylor Foo numbers unlikely to exceed 20,000,000. I understand a shared_ptr is not thread-safe. In that case, when the reader is reading, wouldn't the values change while if the writer is also writer with the same shared_ptr? – L.Koh Aug 21 '18 at 15:22
  • No, shared_ptr is thread-safe - that's the whole point of it - it is shared! The previous incarnation of smart_ptr was frankly rubbish. – Gem Taylor Aug 21 '18 at 15:28
  • The trick here is that when the reader "takes" the shared_ptr it increases the ref count. Now, it doesn't matter what the writer does in the map, that shared object won't be deleted until the reader releases it. If there is no contention, you have saved an object copy. If there is contention then you temporarily consume memory.Effectively, it brings in a "cheap" per-node lock. – Gem Taylor Aug 21 '18 at 15:32
  • @GemTaylor sorry if I am asking very basic questions as I am new to C++, but this post seems to suggest that shared_ptr is not thread safe? https://stackoverflow.com/a/47117985/9514363 – L.Koh Aug 21 '18 at 15:40
  • And this as well: https://stackoverflow.com/a/9133225/9514363 – L.Koh Aug 21 '18 at 15:46
  • The _construction_ of a shared_ptr "pointer" is not itself thread-safe, but it generally doesn't have to be. Nothing knows about the pointer until you offer it to the world. Once the shared_ptr exists then the thing it points to is completely thread-safe. In this case, the shared_ptr instances are either locally owned by the calling code or constructed as item members in the map. The ones in the map are protected during construction by your mutex. The only problem occurs if, say you had a global shared_ptr "pointer" instance and 2 thread try to construct it or set it at the same time. – Gem Taylor Aug 21 '18 at 16:20
  • @GemTaylor Thanks for your detailed explanation it is clear now and I have edited the code accordingly. – L.Koh Aug 21 '18 at 16:45
  • @L.Koh That 'get' you wrote in FooMap inserts an entry in your map if the key is missing. Also, you might want to create your FooPtr and assign its values before getting the lock and putting it inside the map, just to make sure you stay 'locked' as less as possible – AlexG Aug 21 '18 at 17:40
  • 1
    @GemTaylor No, *the thing it points to* is not automatically thread-safe, nor is the pointer itself. The only thread-safe thing is its reference counter. You can copy and destroy `shared_ptr`s that live in different threads and point to the same object, and the reference count will update correctly. No more, no less. – n. m. could be an AI Aug 21 '18 at 20:58
  • OK I do take your point that the content is not magically thread safe. The thing that makes it magically thread safer in this case is that the writer creates a new instance when it modifies the map entry. This makes it thread safe, so long as the readers don't modify it, which they couldn't do before because they had a copy. – Gem Taylor Aug 22 '18 at 10:50
  • 1
    @GemTaylor The code as it stands (rev. 6) has a number of problems. 1) No `Foo` is ever created so the `FooPtr`s are all null. 2) By returning a pointer from `get`, the reader can access `bar1` and/or `bar2` while the `set` method may be simultaneously modifying them. 3) The calls to `FooMap::operator[]` in `get` and `set` may be modifying the map simultaneously since no lock is taken in `get`. @L.Koh I'd suggest reverting the changes to the code. – Andrew Durward Aug 22 '18 at 11:30
  • If 1mln operations in a sec go for a lock-free struct with statically allocated memory. Think make sense if you could put real-live class/member names. There may be other approach that could suit you. Would data split make sense? You can split into two separate write threads (first will use key%2==0, second key%2==1) then it will be SWMR maps. Would also question if map is what you need at the end. Maybe std::deque could do (faster as no hash)? One issue you find with std::unordered_map is rehash. Make sense if you could elaborate how you gonna erase. – M.L. Aug 24 '18 at 15:05

2 Answers2

1

Ok, I give up... I'n afraid you code has drifted a long way from my thoughts. This is what I'm really suggesting, where you write into a new instance of the shared object to create the new foo:

#include <mutex>
struct Foo
{
  double bar1 = 0;
  long bar2 = 0;
};

typedef std::shared_ptr<Foo> FooPtr;
typedef std::shared_ptr<const Foo> CFooPtr;

class FooMap
{
  std::mutex mMutex;
  std::unordered_map<long, FooPtr> mMap;

public:
  CFooPtr get(long const& key) 
  {    
    auto found = mMap.find(key);
    if (found == mMap.end())
    {
      return nullptr;
    }  
    return found->second;
  }

  void set(long const& key, double a, long b) 
  {
    FooPtr tmp = make_shared<Foo>({a,b});
    {
      std::unique_lock<std::mutex> lock(mMutex);
      mMap[key].swap(tmp);
    }
    // previous entry contents are destroyed outside mutex
  }
};
Gem Taylor
  • 5,381
  • 1
  • 9
  • 27
  • okay I understand now and thanks for laying out the code! sorry I misunderstood what you meant previously. – L.Koh Aug 23 '18 at 02:05
0

Here you can find design of lock-free hash map with STL only at https://shlomisteinberg.com/2015/09/28/designing-a-lock-free-wait-free-hash-map/

Claim to be better than TBB on heavy writes.

M.L.
  • 728
  • 4
  • 12