maximum number of threads - How to determine if your C++ pthreads are running in parallel?

Question

I am not very experienced with multithreaded programming. In my C++ program I create 10 pthreads, each responsible for processing part of the real-time data streamed to the system. Eventually, inside main(), the processed data from all threads is merged. To my knowledge, the number of threads that can run in parallel is cores per socket x threads per core x sockets, which in my case is 8 (4 x 2 x 1). So is it the case that in my application, with 10 threads, no parallelization is happening and the scheduler is just swapping between the threads? If not, is there any way to determine this? Or can I at least find out how many milliseconds this swapping takes? Since my application deals with real-time data, it's crucial to make sure all the threads are running in parallel.

"Swapping" between different threads of execution has been happening since multitasking was invented 50 or 60 years ago. If your system supports only 8 simultaneous threads, then such swapping has to happen. And that swapping also includes all the other programs and their processes and threads running on your system. – Some programmer dude, May 02 '18 at 13:47
[`std::thread::hardware_concurrency`](http://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency) may be of interest to you. – François Andrieux, May 02 '18 at 13:49

score 3 · Accepted Answer · answered May 02 '18 at 13:55

"To my knowledge the exact number of threads ... "

Sort of. There's also hyper-threading, which skews things, and C++ also runs on non-PC hardware (e.g. embedded devices), where the number is fixed.
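As a rough illustration of the point above, the count can be queried at run time instead of being computed by hand. This is only a sketch: the helper names `hardware_threads` and `can_all_run_in_parallel` are made up for this example, and the value returned by `std::thread::hardware_concurrency` is a hint that may be 0 when it cannot be determined.

```cpp
#include <thread>

// Number of hardware threads reported by the implementation.
// The standard allows 0 ("unknown"), so fall back to a caller-supplied value.
unsigned hardware_threads(unsigned fallback = 1) {
    const unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}

// True if `requested` threads could, at best, all be on a CPU at the same
// instant. This is an upper bound only: the OS and other processes compete
// for the same hardware threads.
bool can_all_run_in_parallel(unsigned requested) {
    return requested <= hardware_threads();
}
```

On the machine described in the question this would presumably report 8, so `can_all_run_in_parallel(10)` would be false there.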

"is what the scheduler does"

Yes. But not just your threads: all the threads of the OS and of other applications are scheduled this way too.

"or at least can I know how many milliseconds does this swapping takes"

No, and it doesn't matter, since you can't do anything about it. The best you can do is track how much CPU time your thread has had, work out the total CPU time available to a single thread, and from that compute what percentage of the time you got. But again, you should never need to do this, because it likely means your program is worrying about things outside its control. Monitoring the CPU usage of the machine would be more meaningful.
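For completeness, the percentage described above could be estimated roughly as follows. This sketch assumes a POSIX system that provides `CLOCK_THREAD_CPUTIME_ID`; the helper names `thread_cpu_seconds` and `cpu_share` are invented for illustration.

```cpp
#include <chrono>
#include <time.h>

// CPU time consumed so far by the calling thread, in seconds.
// Returns -1.0 if the per-thread CPU clock is unavailable.
double thread_cpu_seconds() {
    timespec ts{};
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) return -1.0;
    return static_cast<double>(ts.tv_sec) + ts.tv_nsec / 1e9;
}

// Runs `work` on the calling thread and returns CPU time used divided by
// wall-clock time elapsed. A value near 1 means the thread was on a CPU
// almost the whole time; a low value means it was blocked or switched out.
template <class Work>
double cpu_share(Work&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const double cpu_start = thread_cpu_seconds();
    work();
    const double cpu_used = thread_cpu_seconds() - cpu_start;
    const std::chrono::duration<double> wall =
        std::chrono::steady_clock::now() - wall_start;
    return wall.count() > 0.0 ? cpu_used / wall.count() : 0.0;
}
```

A thread that spends most of its time blocked in a socket read would show a share close to 0, which is normal and not a sign of a scheduling problem.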

"Since my application is dealing with real time data, its crucial to make sure all the threads are running in parallel."

How your threads run is not up to you; it's up to the OS. There are more important tasks on your computer than your program. For example, the OS needs processing time to handle memory requests made by applications.

Some OSes can do this in a more specialised way; QNX, for example, is a real-time OS, but this comes at other costs.

One thing you can be sure of, though: if at most 8 threads can run at once and you create 10, they will not all run in parallel. You may wish to look into libraries that help keep the threads you create busy, boost::asio for example.
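One way to picture the "keep the threads busy" idea without any library is to start no more workers than there are hardware threads and let each worker pull the next task index from a shared counter. This is a plain standard-library sketch, not boost::asio, and `run_on_workers` is a hypothetical name.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Runs task(i) for every i in [0, task_count), using at most one worker per
// reported hardware thread. Returns the number of workers that were started.
template <class Task>
unsigned run_on_workers(std::size_t task_count, Task task) {
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) hw = 1;  // value unknown: fall back to a single worker
    const unsigned workers =
        static_cast<unsigned>(std::min<std::size_t>(hw, task_count));

    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            // Each worker claims indices until none are left.
            for (std::size_t i = next.fetch_add(1); i < task_count;
                 i = next.fetch_add(1)) {
                task(i);
            }
        });
    }
    for (auto& t : pool) t.join();
    return workers;
}
```

With 10 chunks of data and 8 hardware threads, 8 workers would be started and the two remaining chunks would be picked up by whichever workers finish first. This suits CPU-bound work; threads that mostly block on I/O are a different case.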

answered May 02 '18 at 13:55

UKMonkey


Thanks for your complete answer! So following what you said, even if I have 8 threads, they are not going to be parallel anyways, since the OS is not handling only my program... – Ali Nouri May 02 '18 at 14:09
They will likely be running in parallel at some point, and they will likely not be running in parallel at some other point. This is why thread synchronisation is so important. – UKMonkey May 02 '18 at 14:15
Is it a correct way to use structures such as timeval in C++ and calculate the time inside each thread? Basically what each thread does in my application is to create and open a UDP socket and listen on a specific port inside a while loop for realtime data coming from other linux machines. – Ali Nouri May 02 '18 at 14:48
@AliNouri https://stackoverflow.com/questions/44916362/how-can-i-measure-cpu-time-of-a-specific-set-of-threads Just remember: 1. your thread can be context switched at any time, including just before or after getting any time information; 2. if a thread is blocked (e.g. in a `read()`), the OS will switch it out. – UKMonkey May 02 '18 at 14:56
On Linux, you can change the "niceness" of a process and its threads (`man nice`), which will encourage the OS to give your threads processing time over others. – UKMonkey May 02 '18 at 14:58
I understand the point that this swapping can happen before or after getting the time information. However, if each thread changes the value of an integer, for example, then capturing the time between these changes of the integer value inside main() can give me information about the time taken by each thread. I already know that it takes 3 ms for each piece of data to be streamed to the thread, and amazingly, when I capture this time it is around 3~4 ms, which is inconsistent with the idea that swapping can take on the order of 10~20 ms if it's happening... – Ali Nouri May 02 '18 at 15:11
@AliNouri https://stackoverflow.com/questions/21887797/what-is-the-overhead-of-a-context-switch 10 ms is at least 3 orders of magnitude out (a switch takes roughly 10 us). – UKMonkey May 02 '18 at 15:32
One last question to clarify everything for myself. If some pthreads do I/O, such as streaming data via UDP or streaming images from a camera, then it is not worth spending the thread's time waiting for a response, and this is exactly where swapping happens. So the questions are: 1) does the C++ program handle this itself? 2) in the end, is all this swapping designed so that the resulting timing is as if all 10 threads were actually running in parallel (in the case where the threads do I/O)? – Ali Nouri May 02 '18 at 18:23
For real-time applications, sometimes you need a true hard real-time OS. That will give you much more control than you get with a best-effort Unix-type kernel. Of course, Linux has some real-time elements like the SCHED_DEADLINE scheduler and the rt patches. Those are sometimes all you need. – Erik Alapää Jul 01 '19 at 20:03
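As a small illustration of the niceness suggestion in the comments above, the nice value can also be read and changed from inside the program. This is a Linux/POSIX sketch using `getpriority` and `setpriority`; the helper names are hypothetical, and lowering the nice value (raising priority) normally requires elevated privileges, so the call may fail with `EPERM` or `EACCES`.

```cpp
#include <cerrno>
#include <sys/resource.h>

// Reads the nice value of the caller (range -20..19, lower means more
// favourable scheduling). `ok` is set to false if the call failed.
// getpriority can legitimately return -1, so errno must be checked.
int current_niceness(bool& ok) {
    errno = 0;
    const int value = getpriority(PRIO_PROCESS, 0);
    ok = !(value == -1 && errno != 0);
    return value;
}

// Tries to set the nice value of the caller. Returns 0 on success or the
// errno code on failure (typically a permission error when lowering it).
int try_set_niceness(int value) {
    return setpriority(PRIO_PROCESS, 0, value) == 0 ? 0 : errno;
}
```

A nice value only biases the ordinary scheduler; it does not give the timing guarantees of the real-time options mentioned in the last comment.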

score -1 · Answer 2 · answered May 02 '18 at 14:16

I cannot add comments yet, so I'm posting part of this as an answer.

"or at least can I know how many milliseconds does this swapping takes"

On Windows the default is 10-20 ms. With a little trick you can change this to 1-2 ms:

#include <windows.h>   // must be included before Mmsystem.h
#include <Mmsystem.h>
#pragma comment(lib,"winmm.lib")
timeBeginPeriod(1); // request a 1 ms system timer resolution; call inside a function such as main()

If you want another dirty trick to give your process more power, add the two lines below in main(); the second line also needs to be called in every thread.

SetPriorityClass(GetCurrentProcess(),REALTIME_PRIORITY_CLASS);
SetThreadPriority(GetCurrentThread(),THREAD_PRIORITY_NORMAL);

Be warned! You will make Windows unresponsive if you do this with more threads than there are processing cores, so you need to give the system some time in the threads.

On Linux the default is already 1 ms (dynamic), but you can switch between different schedulers.

timeBeginPeriod changes the resolution of a clock; it doesn't impact how long it takes to swap; which is certainly nowhere near 1ms — UKMonkey, May 02 '18 at 14:49
It changes how long it takes before an interrupted thread can possibly be scheduled again. – Benno Geels, May 02 '18 at 19:41
