I have a program that needs to execute with 100% performance but I see that it is sometimes paused for more than 20 uSec. I've struggled with this for a while and can't find the reason/explanation.
So my question is: Why is my program "paused"/"stalled" for 20 uSec every now and then?
To investigate this I wrote the following small program:
#include <string.h>
#include <iostream>
#include <signal.h>
using namespace std;
unsigned long long get_time_in_ns(void)
{
struct timespec tmp;
if (clock_gettime(CLOCK_MONOTONIC, &tmp) == 0)
{
return tmp.tv_sec * 1000000000 + tmp.tv_nsec;
}
else
{
exit(0);
}
}
bool go_on = true;
static void Sig(int sig)
{
(void)sig;
go_on = false;
}
int main()
{
unsigned long long t1=0;
unsigned long long t2=0;
unsigned long long t3=0;
unsigned long long t4=0;
unsigned long long t5=0;
unsigned long long t2saved=0;
unsigned long long t3saved=0;
unsigned long long t4saved=0;
unsigned long long t5saved=0;
struct sigaction sig;
memset(&sig, 0, sizeof(sig));
sig.sa_handler = Sig;
if (sigaction(SIGINT, &sig, 0) < 0)
{
cout << "sigaction failed" << endl;
return 0;
}
while (go_on)
{
t1 = get_time_in_ns();
t2 = get_time_in_ns();
t3 = get_time_in_ns();
t4 = get_time_in_ns();
t5 = get_time_in_ns();
if ((t2-t1)>t2saved) t2saved = t2-t1;
if ((t3-t2)>t3saved) t3saved = t3-t2;
if ((t4-t3)>t4saved) t4saved = t4-t3;
if ((t5-t4)>t5saved) t5saved = t5-t4;
cout <<
t1 << " " <<
t2-t1 << " " <<
t3-t2 << " " <<
t4-t3 << " " <<
t5-t4 << " " <<
t2saved << " " <<
t3saved << " " <<
t4saved << " " <<
t5saved << endl;
}
cout << endl << "Closing..." << endl;
return 0;
}
The program simply test how long time it takes to call the function "get_time_in_ns". The program does this 5 times in a row. The program also tracks the longest time measured.
Normally it takes 30 ns to call the function but sometimes it takes as long as 20000 ns. Which I don't understand.
A little part of the program output is:
8909078678739 37 29 28 28 17334 17164 17458 18083
8909078680355 36 30 29 28 17334 17164 17458 18083
8909078681947 38 28 28 27 17334 17164 17458 18083
8909078683521 37 29 28 27 17334 17164 17458 18083
8909078685096 39 27 28 29 17334 17164 17458 18083
8909078686665 37 29 28 28 17334 17164 17458 18083
8909078688256 37 29 28 28 17334 17164 17458 18083
8909078689827 37 27 28 28 17334 17164 17458 18083
The output shows that normal call time is approx. 30ns (column 2 to 5) but the largest time is nearly 20000ns (column 6 to 9).
I start the program like this:
chrt -f 99 nice -n -20 myprogram
Any ideas why the call sometimes takes 20000ns when it normally takes 30ns?
The program is executed on a dual Xeon (8 cores each) machine.
I connect using SSH.
top shows:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8107 root rt -20 16788 1448 1292 S 3.0 0.0 0:00.88 myprogram
2327 root 20 0 69848 7552 5056 S 1.3 0.0 0:37.07 sshd