pthread signaling without kernel call

Question

I am running a few threads using pthreads on a real-time Linux (RedHawk) in C++. All the threads run in a fixed-frequency loop, and one of the threads polls the CPU clock and alerts the other two threads that the next loop has started (by the end of a loop we can safely assume that the other threads have finished their tasks and are waiting for the next loop). My goal is to reduce latency where possible, and I can let threads take 100% of the CPU they are on (and guarantee they are the only thing running on that CPU, thanks to the RedHawk enhancements).

My idea was to have the timing thread poll the CPU tick count until it exceeds X, then increment a 32- or 64-bit counter without taking a mutex. The other two threads will poll this counter and wait for it to increase, also without taking a mutex. As I see it, no mutex is needed: the first thread can increment the counter atomically because it is the only thing writing to it, and the other two threads can read it without fear because a 32- or 64-bit number can be written to memory without ever being in a partial state (I think).

I realize that all my threads will be polling something and therefore running at 100% all the time, and I could reduce that by using pthreads signaling, but I believe the latency there is more than I want. I also know a mutex takes only a few tens of nanoseconds, so I could probably use one without noticing the latency, but I don't see why it is needed when one thread increments a counter and the other two poll it.

@John, I did and it seems to work, but I find that in cases like this it's difficult to spot issues, and those issues can occur randomly, so I was hoping someone with a good understanding of the low-level workings here could chime in. — Michael, Sep 08 '14 at 16:47

score 2 · Accepted Answer · answered Sep 08 '14 at 13:11


You need to tell the compiler that your counter is a synchronization variable. You do that by declaring your counter as std::atomic and then using the built-in operations: either fetch_add() or operator++() for the increment, and load() in the reading threads. See http://en.cppreference.com/w/cpp/atomic/atomic.

If you don't declare your counter atomic, you will have a data race: your program has no defined semantics, and the compiler is permitted to (and probably will) move code around with respect to the counter test, which will probably lead to results you don't expect.

You need C++11 to get std::atomic. With g++ 4.7 and later you enable it with the -std=c++11 flag; older versions use -std=c++0x instead.
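
As a rough illustration of the std::atomic approach, here is a minimal sketch. The names tick, done, payload, timer_loop, worker_loop and run_demo are made up for the example, std::thread stands in for pthreads, and std::chrono::steady_clock stands in for the CPU tick counter. The done counters exist only so that the demo can enforce the question's assumption that the workers have finished before the next loop starts.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// All names here are hypothetical, chosen for illustration only.
std::atomic<std::uint64_t> tick{0};   // incremented only by the timing thread
std::atomic<std::uint64_t> done[2];   // last loop each worker has finished
std::uint64_t payload = 0;            // ordinary data published with each tick

// Timing thread: busy-wait on a clock, publish data, then bump the counter.
void timer_loop(std::uint64_t loops, std::chrono::microseconds period) {
    auto next = std::chrono::steady_clock::now();
    for (std::uint64_t n = 1; n <= loops; ++n) {
        next += period;
        while (std::chrono::steady_clock::now() < next) {}  // poll the clock
        // Demo only: wait until both workers have finished loop n - 1.
        for (auto& d : done) while (d.load() < n - 1) {}
        payload = n * 10;  // plain write, ordered before the increment
        ++tick;            // atomic increment (sequentially consistent)
    }
}

// Worker: spin on the counter, then read what the timer published.
void worker_loop(int id, std::uint64_t loops, std::uint64_t* ok) {
    for (std::uint64_t n = 1; n <= loops; ++n) {
        while (tick.load() < n) {}     // busy-wait for loop n to start
        if (payload == n * 10) ++*ok;  // the write before the tick is visible
        done[id].store(n);             // report loop n as finished
    }
}

// Runs one timer and two workers; returns how many loops saw correct data.
std::uint64_t run_demo(std::uint64_t loops) {
    tick.store(0);
    for (auto& d : done) d.store(0);
    payload = 0;
    std::uint64_t ok[2] = {0, 0};
    std::thread t(timer_loop, loops, std::chrono::microseconds(50));
    std::thread w0(worker_loop, 0, loops, &ok[0]);
    std::thread w1(worker_loop, 1, loops, &ok[1]);
    t.join();
    w0.join();
    w1.join();
    assert(tick.load() == loops);
    return ok[0] + ok[1];
}
```

Calling run_demo(8) should return 16, since each of the two workers observes the published payload on every loop. The default memory order is used throughout to match operator++() and load() as named above; weaker orderings are possible but are not needed to show the idea.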

Wandering Logic


And if you can't use C++11, you can probably declare it as `volatile`. Someone now will show up and say this isn't guaranteed to work and whatnot, but people who can't use C++11 may find it useful. – John Zwinck Sep 09 '14 at 00:38

I'll be the one who will show up and say that not only is it not guaranteed to work, it is _worse_ than doing nothing. https://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/. Volatile disables the _wrong_ optimizations. Without C++11 the only way to do it is to either use assembly, or use the non-portable compiler intrinsics for memory fences. gcc: https://gcc.gnu.org/onlinedocs/gcc-4.4.5/gcc/Atomic-Builtins.html#Atomic-Builtins, or something like https://packages.debian.org/sid/libatomic-ops-dev. – Wandering Logic Sep 09 '14 at 00:50
And here's http://stackoverflow.com/questions/6397662/if-volatile-is-useless-for-threading-why-do-atomic-operations-require-pointers. The question includes links to two other web resources that explain why volatile won't solve any useful problem in multithreaded synchronization. – Wandering Logic Sep 09 '14 at 00:52
An even better link to explain why volatile is insufficient: http://stackoverflow.com/questions/2484980/why-is-volatile-not-considered-useful-in-multithreaded-c-or-c-programming – Wandering Logic Sep 09 '14 at 13:22
So by using operator++ and load() does it do one or more of the following: 1. ensures the increment will be seen by a different loop ASAP, 2. ensure that all operations performed before the increment in the write thread are seen after the read in the following thread? – Michael Sep 11 '14 at 01:02
Yes, both. 1. It ensures the increment will be seen by all other threads ASAP. 2. It also guarantees that the compiler and cache system will make sure that all operations before the increment appear to happen before the increment to all threads. 3. It also guarantees that the ++ operation will be done with a truly atomic instruction, which there is no other (portable) way to guarantee in C or C++. (I know this last doesn't matter to you in this case, but if someone comes along in a few years and adds a second producer it very much will matter.) – Wandering Logic Sep 11 '14 at 01:13

score 0 · Answer 2 · answered Sep 08 '14 at 13:07


Since there will be a shared variable, with one thread modifying (incrementing) it and the others reading it, it would be best to wrap each access between pthread_mutex_lock and pthread_mutex_unlock to ensure mutual exclusion.
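
A minimal sketch of that mutex-based variant might look like the following. The names counter_lock, loop_counter, advance_loop and read_loop are hypothetical, and the asserts only check that locking succeeded.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static std::uint64_t loop_counter = 0;  // protected by counter_lock

// Called by the timing thread at the start of each loop.
void advance_loop() {
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;  // silence the unused-variable warning when NDEBUG is set
    ++loop_counter;
    pthread_mutex_unlock(&counter_lock);
}

// Called by the polling threads to read the current loop number.
std::uint64_t read_loop() {
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;
    std::uint64_t value = loop_counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}
```

A polling thread would call read_loop() repeatedly until the value changes. On Linux an uncontended pthread mutex is normally taken without entering the kernel, but with three threads hammering the same lock, contention (and therefore kernel calls) becomes likely.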

Dr. Debasish Jana

