How to minimize the mutex locking for an object when only 1 thread mostly uses that object and the other thread(s) use it rarely?

Question

Scenario

Suppose there are "Thread_Main" and "Thread_DB", with a shared SQLite database object. It's guaranteed that,

"Thread_main" seldom uses SQLite object for reading (i.e. SELECT())
"Thread_DB" uses the SQLite object most of the time for various INSERT, UPDATE, DELETE operations

To avoid data races and UB, SQLite should be compiled with SQLITE_THREADSAFE=1 (default) option. That means, before every operation, an internal mutex will be locked, so that DB is not writing when reading and vice versa.

"Thread_Main"    "Thread_DB"    no. of operation on DB
=============    ===========    ======================
something           INSERT             1
something           UPDATE             2
something           DELETE             3
something           INSERT             4
...                   ...             ...         (collapsed)
something           INSERT            500
something           DELETE            501
...                   ...             ...         (collapsed)
something           UPDATE            1000
something           UPDATE            1001
...                   ...             ...         (collapsed)
SELECT              INSERT            1200  <--- here is a serious requirement of mutex
...                   ...             ...         (collapsed)

Problem

As seen in above, out of 100s of operations, the need of real mutex is required only once in a while. However to safeguard that small situation, we have to lock it for all the operations.

Question: Is there a way in which "Thread_DB" holds the mutex most of the time, so that every time locking is not required? The lock/unlocks can happen only when "Thread_Main" requests for it.

Notes

One way is to queue up the SELECT in the "Thread_DB". But in larger scenario with several DBs running, this will slow down the response and it won't be real time. Can't keep the main thread waiting for it.
I also considered to have a "Thread_Main" integer/boolean variable which will suggest that "Thread_Main" wants to SELECT. Now if any operation is running in "Thread_DB" at that time, it can unlock the mutex. This is fine. But if no writeable operation is running on that SQLite object, then "Thread_main" will keep waiting, as there is no one in "Thread_DB" to unlock. Which will again delay or even hang the "Thread_Main".

It sounds like you want read/write locks, a bit like this answer: https://stackoverflow.com/questions/39185420/is-there-a-shared-lock-guard-and-if-not-what-would-it-look-like/39185510#39185510 — Galik, Dec 30 '19 at 06:34
Thanks @Galik, that post is partly useful. But in my case, the DB must not be writing when the `SELECT()` happens. From your answer, I am unable to get how that will be accomplished in a multithreaded environment. — iammilind, Dec 30 '19 at 06:51
@Galik: the OP wants to reduce lock/unlock overhead for the thread doing the mostly-writing. Potential parallel reads aren't the main issue. A readers/writers lock still requires uncontended writers to do an atomic operation to take the lock. (and readers, for that matter. All it gains you is multiple readers able to enter the critical section. They still have to bounce around the cache line containing the lock state itself. And it makes zero difference to the uncontended case.) — Peter Cordes, Dec 30 '19 at 06:53
Set priority for each task, so that highest priority task and hence the corresponding thread will get the lion's share. Need to use a priority queue and execute as package tasks. — seccpur, Dec 30 '19 at 06:54
@seccpur: I don't see why thread scheduling would be an issue with only 2 threads. But yes, bundling up multiple insert/update/delete ops and doing multiple ops inside one critical section could reduce locking overhead. The OP seems to have some very low latency requirement for the reader thread, though, since they say that having the main thread talk to the DB thread some other way for read requests could be too slow. So even holding the lock for longer might be a problem. I think we need more info on acceptable latencies and what (realtime?) OS and HW this is running on. — Peter Cordes, Dec 30 '19 at 07:03
@PeterCordes, you have got the problem right. There is no specific OS for this. It's supposed to be cross platform (Qt). Bundling multiple operations in 1 is not an option due to architectural demand. But in realtime, the DB thread will be operating on multiple (10s or 100s) of sqlite.db files. There is no guarantee that which sqlite.db out of those will request a `SELECT` operation. I am fine with small latency if it's guaranteed or consistent. Currently this example is with 1 DB. Plz let me know if more info needed., — iammilind, Dec 30 '19 at 07:07
An `atomic want_access` counter (maybe per DB) could maybe work: DB thread checks it after each operation, and unlocks + yields if it's non-zero. (Reading an atomic var that other threads haven't touched recently is much cheaper than unlocking + re-locking a mutex). This gives the main thread a chance to take the lock. (The DB thread privately keeps track of whether it left the mutex locked or not, so it knows before doing another DB operation whether it needs to take the mutex for that DB.) But if the DB thread is idle for a while, you'd need the main thread to signal it. — Peter Cordes, Dec 30 '19 at 07:12
If you have many DBs, you might want more than 1 DB thread to get some parallelism between updates to separate DBs. Like maybe up to 6 threads, and beyond that a single thread handles more than 1 DB. It sounds crazy to serialize all your database accesses by doing it all from a single thread. — Peter Cordes, Dec 30 '19 at 07:15
BTW, an uncontended mutex (unlocked and hot in L1d cache on the core taking the lock) is fairly cheap. Your critical section has to be very lightweight before it's worth worrying much about it. [avoid cost of std::mutex when not multi-threading?](//stackoverflow.com/a/59286955). Like just a `push_back()` on a `std::vector()` would justify trying to optimize it away when possible, but if SQLite update/delete/insert involve making sure the data commits to disk then it's massively slow compared to ~20 clock cycles on a modern x86. — Peter Cordes, Dec 30 '19 at 07:19
Have you benchmarked to check if such an optimization would have any measurable effect ? — , Dec 30 '19 at 08:53
@Peter, thanks for several good inputs. If you can put them into an answer, then it will be very useful. I will come up with more doubts. — iammilind, Dec 30 '19 at 09:34
@Yves, no benchmarks so far. I saw a very good scope of optimization. Depending on platform & OS, this may her different performances. — iammilind, Dec 30 '19 at 09:35
The cost of locking the mutex is insignificant when you don't have to wait for it. Don't bother with this. — Matt Timmermans, Dec 30 '19 at 14:02
@MattTimmermans, can you put this as an answer along with any references or bechmarks or link to those. Will help future visitors — iammilind, Dec 30 '19 at 14:59
It's still not clear why you can't have your DB thread respond to requests from the main thread (which it polls for with an atomic var, or as one of multiple FDs in a syscall it would do anyway), so you could keep all your DB access in one thread and use `SQLITE_THREADSAFE=0`. Your first "note" implies some very low latency or even hard or soft real-time requirements for the main thread which would seem to be incompatible with any ideas for having the main thread notify the DB thread that it wants a chance to access the DB, or to access the DB through the other thread. — Peter Cordes, Dec 30 '19 at 15:59
Without any benchmark and no idea of running times, you cannot see "a very good scope of optimization". — , Dec 30 '19 at 17:55
@PeterCordes, `SELECT` is associated with GUI and happens in main or worker thread. Hence can't wait too long. The other writes (`INSERT,UPDATE/DELETE`) happen in DB thread, which need not be real time. I am using `QThread` & figuring if I can overload their event queue mechanism. So whenever a `SELECT` happens, pause all operations in the DB thread and let it quickly run first. Except `SELECT` all other write accesses are in DB thread due to design. DB object is accessible from all the threads for read/write purpose. (For that, Qt has internal mutex based queue, which is fine & unavoidable.) — iammilind, Dec 31 '19 at 03:40
@YvesDaoust, [this answer](https://stackoverflow.com/a/32318127/514235) and the discussion underneath. It suggests that, how mutexes with `-pthread` enabled can bring down the performance. [This answer](https://stackoverflow.com/a/59286955/514235) gives some benchmarks. As per my this Qn, when we already know that most of the times the DB thread only accesses the DB all the times, it's trivially convincing enough to me that I should attempt for optimisation. Though I agree that the benchmarks are important. But since my app runs on multi platforms, where all show variations. — iammilind, Dec 31 '19 at 04:58

score 1 · Answer 1 · answered Dec 31 '19 at 21:37

Here's a suggestion: modify your program somewhat so that Thread_Main has no access to the shared object; only Thread_DB is able to access it. Once you've done that, you won't need to do any serialization at all, and Thread_DB can work at full efficiency.

Of course the fly in the ointment is that Thread_Main does sometimes need to interact with the DB object; how can it do that if it doesn't have any access to it?

The solution to that issue is message-passing. When Thread_Main needs to do something with the DB, it should pass a Message object of some sort to Thread_DB. The Message object should contain all the details necessary to characterize the desired interaction. When Thread_DB receives the Message object, Thread_DB can call its execute(SQLite & db) method (or whatever you want to call it), at which point the necessary data insertion/extraction can occur from within the context of the Thread_DB thread. When the interaction has completed, any results can be stored inside the Message object and the Message object can then be passed back to the main thread for the main thread to deal with the results. (the main thread can either block waiting for the Message to be sent back, or continue to operate asynchronously to the DB thread, it's up to you)

This was already suggested by a user in the comments above. However, as I [mentioned in reply](https://stackoverflow.com/questions/59526616/how-to-minimize-the-mutex-locking-for-an-object-when-only-1-thread-mostly-uses-t/59548609#comment105248401_59526616) and also in the 1st note already, the `SELECT` cannot be queued up in DB thread for the performance reasons. The result of that is directly passed on to the GUI and it shouldn't wait. In Linux mediocre machine, we observed that when there are too many DB instructions, the DB thread simply takes lot of time despite running at higher priority. — iammilind, Jan 01 '20 at 02:01
If the db thread is slow and you need to serialize access to the db, you’ll have the same delays in the shared-access design, with the main thread blocked waiting for the db thread to release the mutex. Perhaps a better approach would be to figure out how to make db operations quicker. — Jeremy Friesner, Jan 01 '20 at 14:35

How to minimize the mutex locking for an object when only 1 thread mostly uses that object and the other thread(s) use it rarely?

Scenario

Problem

Notes

1 Answers1

Linked