9

Usage: In our production environment we have around 100 threads which can access the cache we are trying to implement. On a cache miss, the information is fetched from the database and the cache is updated by a writer thread.

To achieve this, we are planning to implement a multiple-reader, single-writer scheme. We cannot update the g++ version, since we are using g++ 4.4.

Update: Each worker thread can perform both reads and writes. On a cache miss, the information is fetched from the DB and added to the cache.

Problem Statement: We need to implement the cache to enhance performance. Cache reads are far more frequent, and write operations to the cache are much rarer.

I think we can use the boost::shared_mutex, boost::shared_lock, boost::upgrade_lock, and boost::upgrade_to_unique_lock implementation.
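Roughly along these lines (a sketch only; loadFromDb is a placeholder for our DB fetch, and the key/value types are simplified):

```cpp
#include <boost/thread/shared_mutex.hpp>
#include <boost/thread/locks.hpp>
#include <map>
#include <string>

class Cache {
public:
    // Read path: shared lock, many readers in parallel.
    bool tryGet(const std::string& key, std::string& value) const {
        boost::shared_lock<boost::shared_mutex> readLock(mutex_);
        std::map<std::string, std::string>::const_iterator it = data_.find(key);
        if (it == data_.end()) return false;
        value = it->second;
        return true;
    }

    // Miss path: the upgrade lock coexists with readers; upgrade to
    // exclusive only for the actual insert.
    std::string getOrLoad(const std::string& key) {
        boost::upgrade_lock<boost::shared_mutex> upgradeLock(mutex_);
        std::map<std::string, std::string>::const_iterator it = data_.find(key);
        if (it != data_.end()) return it->second;

        std::string value = loadFromDb(key);  // placeholder for the DB fetch

        boost::upgrade_to_unique_lock<boost::shared_mutex> writeLock(upgradeLock);
        data_[key] = value;
        return value;
    }

private:
    std::string loadFromDb(const std::string& key);  // placeholder

    mutable boost::shared_mutex mutex_;
    std::map<std::string, std::string> data_;
};
```

Note that the DB fetch here happens while holding the upgrade lock, which still allows concurrent readers but serializes concurrent misses.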

But we have learnt that boost::shared_mutex has performance issues.

Questions

  1. Does boost::shared_mutex impact performance when reads are much more frequent than writes?
  2. What other constructs and design approaches can we take, given that we are restricted to g++ 4.4?
  3. Is there a way to design it such that reads are lock-free?

Also, we intend to use a map to hold the cached information.

Rohit Sharma
Ajay yadav

4 Answers

3

You need to profile it.

In case you're stuck because you don't have a "similar enough" environment where you can actually test things, you can probably write a simple wrapper using pthreads: pthread_rwlock_t (a minimal RAII sketch follows the list below)

  • pthread_rwlock_rdlock
  • pthread_rwlock_wrlock
  • pthread_rwlock_unlock
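As a rough sketch (C++03-friendly, so it should build with g++ 4.4; error handling omitted), a minimal RAII wrapper over pthread_rwlock_t might look like this:

```cpp
#include <pthread.h>

// Minimal RAII wrapper around pthread_rwlock_t.
class RWLock {
public:
    RWLock()  { pthread_rwlock_init(&lock_, NULL); }
    ~RWLock() { pthread_rwlock_destroy(&lock_); }

    void lockRead()  { pthread_rwlock_rdlock(&lock_); }
    void lockWrite() { pthread_rwlock_wrlock(&lock_); }
    void unlock()    { pthread_rwlock_unlock(&lock_); }

private:
    pthread_rwlock_t lock_;

    RWLock(const RWLock&);            // non-copyable
    RWLock& operator=(const RWLock&);
};

// Scoped guards so the lock is released even on early return or exception.
class ReadGuard {
public:
    explicit ReadGuard(RWLock& l) : lock_(l) { lock_.lockRead(); }
    ~ReadGuard() { lock_.unlock(); }
private:
    RWLock& lock_;
};

class WriteGuard {
public:
    explicit WriteGuard(RWLock& l) : lock_(l) { lock_.lockWrite(); }
    ~WriteGuard() { lock_.unlock(); }
private:
    RWLock& lock_;
};
```

Usage would then be `ReadGuard guard(cacheLock);` around the lookup and `WriteGuard guard(cacheLock);` around the insert, so you can benchmark it against the boost::shared_mutex version.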

Of course you can design things to be lock free. Most obvious solution would be to not share state. (If you do share state, you'll have to check that your target platform supports atomic instructions). However, without any knowledge of your application domain, I feel very safe suggesting you do not want lock-free. See e.g. Do lock-free algorithms really perform better than their lock-full counterparts?

sehe
  • std::unordered_map is intended to be used for the cache. Yes, you are correct, state will not be shared between worker threads. Each worker thread works for both read and write. If the cache is missed, the information is cached from the DB. – Ajay yadav Feb 05 '18 at 08:33
  • Sounds like a classic case for reader-writer locks. You should just profile it. The article you linked is from 2011. That's before compilers supported the standard library http://en.cppreference.com/w/cpp/thread/shared_mutex or even http://en.cppreference.com/w/cpp/thread/shared_timed_mutex. You can safely bet that many things have been improved. If you shared the code, we could help with the benchmark. Sounds like because it's config info (?) it's not on any kind of hot path anyways – sehe Feb 05 '18 at 09:02
  • Unfortunately we cannot upgrade the system to use C++11/14/17. The cache holds attributes of file systems that are checked before a copy from one file system to another. These are less likely to change, but in production such scenarios are possible. Currently they get fetched each time, and the threads exhaust the CPU during heavy load. The design needs to be improved to reduce CPU usage. – Ajay yadav Feb 05 '18 at 09:24
  • I never said anything about upgrading. I said something about the age of the article. "Cache constitute attributes of file systems checked before copy from one file system to other." - that's mumbo jumbo. Code talks. In your question you mentioned a database. Which is it? We can't help if we don't know the problem. – sehe Feb 05 '18 at 09:42
  • It is a database holding only properties of the filesystem. Sorry if that was misleading; you can consider it a database. – Ajay yadav Feb 05 '18 at 09:50
3

If writes were non-existent, one possibility would be a 2-level cache, where you first have a thread-local cache and then the normal cache with a mutex or reader/writer lock.

If writes are extremely rare, you can do the same, but have some lock-free way of invalidating the thread-local caches, e.g. an atomic int updated with every write; when it changes, clear the thread-local cache.
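A rough sketch of that idea, assuming a g++ 4.4 / C++03 environment (it relies on GCC's __thread keyword and __sync atomic builtins; sharedCacheLookup and sharedCacheUpdate are hypothetical functions standing in for the shared, lock-protected cache and the DB fetch):

```cpp
#include <map>
#include <string>

// Bumped by every write; read atomically by the readers.
static volatile long g_cacheVersion = 0;

// Per-thread snapshot of the cache and the version it was built against.
// (The map is leaked at thread exit; ignored here for brevity.)
static __thread std::map<std::string, std::string>* t_localCache = 0;
static __thread long t_localVersion = -1;

// Hypothetical: shared cache guarded by a mutex/rwlock, falls back to the DB on a miss.
std::string sharedCacheLookup(const std::string& key);
void sharedCacheUpdate(const std::string& key, const std::string& value);

std::string cachedGet(const std::string& key) {
    long current = __sync_fetch_and_add(&g_cacheVersion, 0);   // atomic read
    if (!t_localCache)
        t_localCache = new std::map<std::string, std::string>();
    if (t_localVersion != current) {       // some thread wrote since our last read
        t_localCache->clear();             // drop the stale thread-local copy
        t_localVersion = current;
    }

    std::map<std::string, std::string>::const_iterator it = t_localCache->find(key);
    if (it != t_localCache->end())
        return it->second;                 // lock-free fast path

    std::string value = sharedCacheLookup(key);   // slow path: shared cache / DB
    (*t_localCache)[key] = value;
    return value;
}

void cachedPut(const std::string& key, const std::string& value) {
    sharedCacheUpdate(key, value);               // under the shared cache's write lock
    __sync_add_and_fetch(&g_cacheVersion, 1);    // invalidates every thread-local cache
}
```

With only a handful of entries per cache, clearing and repopulating the thread-local copy after a rare write is cheap.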

Rohit Sharma
Hans Olsson
2

It all depends on the frequency of the updates, the size of the cache and how much is changed in the update.

  1. Let's assume you have a rather big cache with a lot of changes on each update. Then I would use a read-copy-update pattern, which is lock-free (see the sketch after this list).

  2. If your cached data is pretty small and read in one go (e.g. a single integer), RCU is also a good choice.

  3. For a big cache with small updates, or a big cache with updates that are too frequent for RCU, a read-write lock is a good choice.
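A hedged sketch of the read-copy-update idea using boost::shared_ptr (names and types are illustrative): readers take a snapshot of the current map and work on it with no further locking; the writer copies the map, modifies the copy, and publishes it. This version is not strictly lock-free, because readers briefly lock a mutex just to copy the pointer; with Boost 1.53 or newer, the shared_ptr atomic access functions (boost::atomic_load/atomic_store) could remove even that.

```cpp
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
#include <boost/thread/mutex.hpp>
#include <map>
#include <string>

typedef std::map<std::string, std::string> Snapshot;
typedef boost::shared_ptr<const Snapshot> SnapshotPtr;

class RcuStyleCache {
public:
    RcuStyleCache() : current_(boost::make_shared<Snapshot>()) {}

    // Readers: copy the pointer (very short critical section), then read the
    // snapshot with no locking at all. Old snapshots stay alive until the
    // last reader releases its shared_ptr.
    SnapshotPtr read() const {
        boost::mutex::scoped_lock lock(ptrMutex_);
        return current_;
    }

    // Writer: copy the current map, modify the copy, publish it.
    void update(const std::string& key, const std::string& value) {
        boost::mutex::scoped_lock serialize(writerMutex_);     // one writer at a time
        boost::shared_ptr<Snapshot> next = boost::make_shared<Snapshot>(*read());
        (*next)[key] = value;
        boost::mutex::scoped_lock lock(ptrMutex_);
        current_ = next;                                       // publish the new snapshot
    }

private:
    mutable boost::mutex ptrMutex_;   // protects only the pointer copy/swap
    boost::mutex writerMutex_;        // serializes writers
    SnapshotPtr current_;
};
```

A lookup then becomes `SnapshotPtr snap = cache.read(); snap->find(key);`, so readers are never blocked behind the DB fetch or the map copy.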

rbf
  • As I am using boost::unordered_map, it can have 8 to around 100 entries, depending on the system configuration. Reads happen very frequently, e.g. a few reads per second. Writes only happen on fail-over or when new information is added, such as system scaling, which is less likely but can happen. – Ajay yadav Feb 05 '18 at 03:30
2

Alongside other answers suggesting you profile it, a large benefit can be had if you can somehow structure or predict the type, order and size of the requests.

  • If particular types of data are requested in a typical cycle, it would be better to split up the cache per data type (a small sketch follows this list). You will improve cache hit/miss ratios and the size of each cache can be adapted to the type. You will also reduce possible contention.

  • Likewise, the size of the requests is important when choosing your update approach. Smaller data fragments may be stored longer or even pooled together, while larger chunks may be requested less frequently.

  • Even with a basic prediction scheme in place that covers only the most frequent fetch patterns, you may already improve performance quite a bit. It's definitely worth it to try and train e.g. a NN (Neural Network) to guess the next request in advance.
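For example, a per-type split might look like the sketch below (the data types and sizes are purely illustrative; each cache gets its own lock, so contention stays confined to the type actually being updated):

```cpp
#include <boost/thread/shared_mutex.hpp>
#include <cstddef>
#include <map>
#include <string>

// One cache instance per data type, each with its own lock and its own size budget.
template <typename Key, typename Value>
struct TypedCache {
    explicit TypedCache(std::size_t cap) : maxEntries(cap) {}

    boost::shared_mutex mutex;            // contention limited to this type
    std::map<Key, Value> entries;
    std::size_t maxEntries;               // sized to fit the type's access pattern
};

struct Caches {
    Caches() : fileAttributes(128), ownershipInfo(32) {}      // illustrative sizes

    TypedCache<std::string, std::string> fileAttributes;      // e.g. path -> attributes
    TypedCache<std::string, std::string> ownershipInfo;       // e.g. path -> owner info
};
```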

StarShine