My question is very similar to this one here, but since that was asked 8 years ago, I was wondering whether there are better ways now. I am also new to Linux programming, so please be gentle!
My scenario is as follows:
The operations performed are:
1. Process 1 and 2 are created with 3 threads. Threads in Process 2 are put to sleep immediately.
2. Process 1 threads are mapped to Core 1, Process 2 threads are mapped to Core 2.
3. A thread in Process 1 enqueues a message in a shared memory queue and notifies (currently via futexes) Process 2.
4. A thread in Process 2 wakes up to dequeue the message.
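Steps 3 and 4 with futexes can be sketched roughly as follows. This is a minimal illustration under assumptions, not the exact code being measured: the names `seq`, `notify_message` and `wait_for_message` are hypothetical, and `seq` is assumed to be a 32-bit counter that lives in the same shared memory mapping as the queue.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The futex syscall operates on a 32-bit word. */
static_assert(sizeof(_Atomic uint32_t) == sizeof(uint32_t),
              "futex word must be 32 bits");

/* glibc provides no futex() wrapper, so go through syscall(2). */
static long futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Step 3 (producer, Process 1): call after the message is enqueued.
 * Bumps the hypothetical sequence counter and wakes one sleeping waiter. */
static void notify_message(_Atomic uint32_t *seq)
{
    atomic_fetch_add_explicit(seq, 1, memory_order_release);
    /* FUTEX_WAKE without the PRIVATE flag works across processes
     * when the word is in a shared mapping. */
    futex(seq, FUTEX_WAKE, 1);
}

/* Step 4 (consumer, Process 2): sleep until the counter moves past
 * last_seen, then return the new value so the caller can dequeue. */
static uint32_t wait_for_message(_Atomic uint32_t *seq, uint32_t last_seen)
{
    uint32_t cur;
    while ((cur = atomic_load_explicit(seq, memory_order_acquire)) == last_seen) {
        /* Sleeps only if *seq still equals last_seen; otherwise the call
         * fails with EAGAIN. The loop also absorbs EINTR and spurious wakeups. */
        futex(seq, FUTEX_WAIT, last_seen);
    }
    return cur;
}
```

The consumer re-checks the counter in a loop, so a wakeup that arrives between the load and the `FUTEX_WAIT` call is not lost: the kernel refuses to sleep if the word has already changed.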
As is obvious from the use of futexes, I am concentrating on thread wakeup latency. My target is to get the time taken by steps 3 and 4 below 3 microseconds.
These are the results I have been able to get for steps 3 and 4 using different IPC synchronization mechanisms:
POSIX mutexes + condition variables - Avg 14.14 microseconds
Pipes - Avg 28.54 microseconds
Futexes - Avg 12.48 microseconds
I know these results are machine-dependent and will vary from system to system. What I need are suggestions on how I can make steps 3 and 4 even faster.
So far I have looked into:
- futexes
- mutexes
- condition variables
- semaphores
- spinlocks (cannot use them because the threads in Process 2 are already sleeping)
Please help!!
Assume that the architecture cannot be changed, i.e., there have to be two processes with threads. Everything else can be changed.