
I have a program that needs to run with 100% of the CPU, but I see that it is sometimes paused for more than 20 µs. I've struggled with this for a while and can't find the reason/explanation.

So my question is: why is my program "paused"/"stalled" for 20 µs every now and then?

To investigate this I wrote the following small program:

#include <string.h>
#include <stdlib.h>    // exit()
#include <time.h>      // clock_gettime(), CLOCK_MONOTONIC
#include <iostream>
#include <signal.h>

using namespace std;

unsigned long long get_time_in_ns(void)
{
  struct timespec tmp;
  if (clock_gettime(CLOCK_MONOTONIC, &tmp) == 0)
  {
    return (unsigned long long)tmp.tv_sec * 1000000000ULL + tmp.tv_nsec; // cast avoids overflow with a 32-bit time_t
  }
  else
  {
    exit(0);
  }
}

volatile sig_atomic_t go_on = 1;  // written from the signal handler

static void Sig(int sig)
{
  (void)sig;
  go_on = 0;
}

int main()
{
  unsigned long long t1=0;
  unsigned long long t2=0;
  unsigned long long t3=0;
  unsigned long long t4=0;
  unsigned long long t5=0;
  unsigned long long t2saved=0;
  unsigned long long t3saved=0;
  unsigned long long t4saved=0;
  unsigned long long t5saved=0;

  struct sigaction sig;
  memset(&sig, 0, sizeof(sig));
  sig.sa_handler = Sig;
  if (sigaction(SIGINT, &sig, 0) < 0)
  {
    cout << "sigaction failed" << endl;
    return 0;
  }

  while (go_on)
  {
    t1 = get_time_in_ns();
    t2 = get_time_in_ns();
    t3 = get_time_in_ns();
    t4 = get_time_in_ns();
    t5 = get_time_in_ns();
    if ((t2-t1)>t2saved) t2saved = t2-t1;
    if ((t3-t2)>t3saved) t3saved = t3-t2;
    if ((t4-t3)>t4saved) t4saved = t4-t3;
    if ((t5-t4)>t5saved) t5saved = t5-t4;
    cout <<
        t1 << " " <<
        t2-t1 << " " <<
        t3-t2 << " " <<
        t4-t3 << " " <<
        t5-t4 << " " <<
        t2saved << " " <<
        t3saved << " " <<
        t4saved << " " <<
        t5saved << endl;
  }

  cout << endl << "Closing..." << endl;

  return 0;
}

The program simply tests how long it takes to call the function "get_time_in_ns". It does this five times in a row, and it also tracks the longest time measured.

Normally it takes about 30 ns to call the function, but sometimes it takes as long as 20000 ns, which I don't understand.

A little part of the program output is:

8909078678739 37 29 28 28 17334 17164 17458 18083
8909078680355 36 30 29 28 17334 17164 17458 18083
8909078681947 38 28 28 27 17334 17164 17458 18083
8909078683521 37 29 28 27 17334 17164 17458 18083
8909078685096 39 27 28 29 17334 17164 17458 18083
8909078686665 37 29 28 28 17334 17164 17458 18083
8909078688256 37 29 28 28 17334 17164 17458 18083
8909078689827 37 27 28 28 17334 17164 17458 18083

The output shows that the normal call time is approx. 30 ns (columns 2 to 5), but the largest measured time is nearly 20000 ns (columns 6 to 9).

I start the program like this:

chrt -f 99 nice -n -20 myprogram

Any ideas why the call sometimes takes 20000ns when it normally takes 30ns?

The program is executed on a dual Xeon (8 cores each) machine.

I connect using SSH.

top shows:

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
8107 root      rt -20   16788   1448   1292 S   3.0  0.0   0:00.88 myprogram
2327 root      20   0   69848   7552   5056 S   1.3  0.0   0:37.07 sshd
Support Ukraine
1 Answer


Even the lowest niceness value is not a real-time priority; the process still runs under SCHED_OTHER, the default time-sharing policy. You need to switch to a real-time scheduling policy with sched_setscheduler(), either SCHED_FIFO or SCHED_RR as required.
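
For reference, here is a minimal sketch (my illustration, not code from the question) of requesting an RT policy from inside the program instead of via chrt. It assumes Linux/glibc and sufficient privileges (root or CAP_SYS_NICE); the priority 99 is simply copied from the chrt invocation in the question:

#include <sched.h>
#include <string.h>
#include <errno.h>
#include <iostream>

int main()
{
  struct sched_param sp;
  memset(&sp, 0, sizeof(sp));
  sp.sched_priority = 99;   // RT priorities range from 1 (lowest) to 99 (highest)

  // 0 means "the calling process"; use SCHED_RR instead of SCHED_FIFO if
  // round-robin among equal-priority RT tasks is wanted.
  if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
  {
    std::cerr << "sched_setscheduler failed: " << strerror(errno) << std::endl;
    return 1;
  }

  // ... time-critical loop goes here ...
  return 0;
}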

Note that this will still not give you an absolute 100% of the CPU if yours isn't the only task running. Even then, Linux reserves a few percent of CPU time for non-real-time tasks, so that a runaway RT task will not effectively hang the machine. Of course, a real-time task that needs 100% CPU time is unlikely to perform correctly anyway.
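
The size of that reservation is controlled by the sched_rt_runtime_us and sched_rt_period_us knobs (950000 out of 1000000 µs by default, i.e. RT tasks get at most 95% of each second). A small sketch, assuming a stock Linux kernel where these /proc files exist, that just reads and reports them:

#include <fstream>
#include <iostream>

// Read a single integer from a /proc file; returns 0 if the file is missing.
static long read_proc_value(const char *path)
{
  long value = 0;
  std::ifstream f(path);
  f >> value;
  return value;
}

int main()
{
  long runtime = read_proc_value("/proc/sys/kernel/sched_rt_runtime_us");
  long period  = read_proc_value("/proc/sys/kernel/sched_rt_period_us");

  std::cout << "sched_rt_runtime_us = " << runtime << "\n"
            << "sched_rt_period_us  = " << period << "\n";

  if (runtime < 0)
    std::cout << "RT throttling disabled: RT tasks may use all CPU time\n";
  else if (period > 0)
    std::cout << "RT tasks may use at most " << (100.0 * runtime / period)
              << "% of each period\n";

  return 0;
}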

Edit: As pointed out, the process already runs with an RT scheduler (nice values are only relevant to SCHED_OTHER, so setting one in addition is pointless). The rest of my answer still applies as to how and why other tasks are still being run (remember that there are also a number of kernel tasks).

Probably the only way to do better is to dedicate one CPU core to the task to get the most out of it. Obviously this only works on multi-core CPUs. There is a related question here: Whole one core dedicated to single process
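
As an illustration of that idea (my sketch, not code from the linked question), the process can be pinned to one core with sched_setaffinity(); actually dedicating the core additionally requires keeping everything else off it, e.g. with the isolcpus= boot parameter and by steering IRQ affinities away from it:

#ifndef _GNU_SOURCE
#define _GNU_SOURCE          // needed for cpu_set_t/CPU_SET/sched_setaffinity
#endif
#include <sched.h>
#include <string.h>
#include <errno.h>
#include <iostream>

int main()
{
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(3, &set);          // core 3 is just an example; pick an isolated core

  // 0 means "the calling process"
  if (sched_setaffinity(0, sizeof(set), &set) != 0)
  {
    std::cerr << "sched_setaffinity failed: " << strerror(errno) << std::endl;
    return 1;
  }

  // ... from here on the process only runs on the chosen core ...
  return 0;
}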

Andreas Bombe
  • Thanks - regarding the part of your answer: I do actually set `SCHED_RR` using `chrt`. This is also shown by `top` as I copied to my Q. – Support Ukraine May 14 '15 at 17:20
  • @StillLearning I have added a bit to my answer. – Andreas Bombe May 14 '15 at 17:34
  • Thanks for the update. I didn't know that `nice` doesn't matter when using `SCHED_RR` - thanks. I'll study your link :) Also the word `interrupt` is a trigger, but I have to admit I'm not really sure how it works on these multi-core CPUs. Thanks for your answer. – Support Ukraine May 14 '15 at 17:35