Linux not respecting SCHED_FIFO priority ? ( normal or GDB execution )

Question

TL;DR

On multiprocessor/multicore machines, several RT SCHED_FIFO threads may be scheduled at the same time, each on a different execution unit. So a thread with priority 60 and a thread with priority 40 may run simultaneously on 2 different cores.

This may be counter-intuitive, especially when simulating embedded systems that (still often today) run on single-core processors and rely on strict priority execution.

See my other answer in this post for summary

Original problem description

Even with very simple code, I have difficulty making Linux respect the priority of my threads with scheduling policy SCHED_FIFO.

See MCVE at the end of the question.
See modified MCVE in answer

This situation comes from the need to simulate embedded code on a Linux PC in order to perform integration tests.

The main thread, with FIFO priority 10, launches the threads divisor and ratio.

divisor thread should get priority 2 so that the ratio thread with priority 1 will not evaluate a/b before b gets a decent value ( this is a completely hypothetical scenario only for the MCVE, not a real life case with semaphores or condition variables ).

Potential prerequisite: you need to be root or, BETTER, to setcap the program so that it can change the scheduling policy and priority

sudo setcap cap_sys_nice+ep main

johndoe@VirtualBox:~/Code/gdb_sched_fifo$ getcap main
main = cap_sys_nice+ep

First experiments were done in a VirtualBox environment with 2 vCPUs (gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git), where the code behaviour was mostly OK under normal execution but NOK under GDB.
Other experiments on native Ubuntu 20.04 show very frequent NOK behaviour even in normal execution, on an I3-1005 2C/4T (gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1).

Compile basically:

johndoe@VirtualBox:~/Code/gdb_sched_fifo$ g++ main.cc -o main -pthread

Normal execution, sometimes OK and sometimes not, without root or setcap

johndoe@VirtualBox:~/Code/gdb_sched_fifo$ ./main
Problem with setschedparam: Operation not permitted(1)  <<-- err msg if no root or setcap
Result: 0.333333 or Result: inf                         <<-- 1/3 or div by 0

Normal execution OK (e.g with setcap )

johndoe@VirtualBox:~/Code/gdb_sched_fifo$ ./main
Result: 0.333333

Now if you want to debug this program, you get the error message again.

(gdb) run
Starting program: /home/johndoe/Code/gdb_sched_fifo/main 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7f929a6a9700 (LWP 2633)]
Problem with setschedparam: Operation not permitted(1)     <<--- ERROR MSG
Result: inf                                                <<--- DIV BY 0
[New Thread 0x7f9299ea8700 (LWP 2634)]
[Thread 0x7f929a6a9700 (LWP 2633) exited]
[Thread 0x7f9299ea8700 (LWP 2634) exited]
[Inferior 1 (process 2629) exited normally]

This is explained in the question "gdb appears to ignore executable capabilities" ( almost all answers may be relevant ).

So in my case I did

sudo setcap cap_sys_nice+ep /usr/bin/gdb
create a ~/.gdbinit with set startup-with-shell off

And as a result I got:

(gdb) run
Starting program: /home/johndoe/Code/gdb_sched_fifo/main 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6e85700 (LWP 2691)]
Result: inf                              <<-- NO ERR MSG but DIV BY 0 
[New Thread 0x7ffff6684700 (LWP 2692)]
[Thread 0x7ffff6e85700 (LWP 2691) exited]
[Thread 0x7ffff6684700 (LWP 2692) exited]
[Inferior 1 (process 2687) exited normally]
(gdb)

So conclusion and question

I thought the only problem came from GDB
Testing on another (non-virtual) target showed even worse results under normal execution

I saw other questions related to RT SCHED_FIFO not being respected, but I find that the answers reach either no conclusion or an unclear one. My MCVE is also much smaller, with fewer potential side effects.

Linux SCHED_FIFO not respecting thread priorities

SCHED_FIFO higher priority thread is getting preempted by the SCHED_FIFO lower priority thread?

Comments brought some pieces of answer but I am still not convinced ... ( ... it should work like this )

The MCVE:

#include <cerrno>
#include <chrono>
#include <cstring>
#include <iostream>
#include <thread>
#include <pthread.h>

double a = 1.0F;
double b = 0.0F;

void ratio(void)
{
    struct sched_param param;
    param.sched_priority = 1;
    int ret = pthread_setschedparam(pthread_self(),SCHED_FIFO,&param);
    // pthread functions return the error code instead of setting errno
    if ( 0 != ret )
        std::cout << "Problem with setschedparam: " << std::strerror(ret) << '(' << ret << ')' << "\n" << std::flush;

    std::cout << "Result: " << a/b << "\n" << std::flush;
}

void divisor(void)
{
    struct sched_param param;
    param.sched_priority = 2;
    pthread_setschedparam(pthread_self(),SCHED_FIFO,&param);

    b = 3.0F;

    std::this_thread::sleep_for(std::chrono::milliseconds(2000u));
}


int main(int argc, char * argv[])
{
    struct sched_param param;
    param.sched_priority = 10;
    pthread_setschedparam(pthread_self(),SCHED_FIFO,&param);

    std::thread thr_ratio(ratio);
    std::thread thr_divisor(divisor);

    thr_ratio.join();
    thr_divisor.join();

    return 0;
}

score 0 · Accepted Answer · answered Sep 26 '20 at 03:01

There are a few things obviously wrong with your MCVE:

You have a data race on b, i.e. undefined behavior, so anything can happen.
You are expecting that the divisor thread will have finished pthread_setschedparam call before the ratio thread gets to computing the ratio.

But there is absolutely no guarantee that the first thread will not run to completion long before the second thread is even created.

Indeed that is what's likely happening under GDB: it must trap thread creation and destruction events in order to keep track of all the threads, and so thread creation under GDB is significantly slower than outside of it.

To fix the second problem, add a counting semaphore, and have both threads rendezvous after each has executed the pthread_setschedparam call.

Employed Russian


There is still something I do not understand. The a/b operation should be performed only after the ratio thread priority is 1. At that time on the system there is potentially the main thread with priority 10 ( if not blocked on the join() ) and the divisor thread with priority either 10 ( before the pthread_setschedparam(...) call ) or 2 ( after the pthread_setschedparam(...) call ). So I would expect that once the ratio thread has a priority of 1, it would be blocked because there are threads on the system with higher priority ( and SCHED_FIFO policy ) – NGI Sep 26 '20 at 05:59
@NGI Did you configure your VBox with 1 CPU? If not, main thread and first (`ratio`) thread will run in parallel. – Employed Russian Sep 26 '20 at 06:52
OK I think I begin to understand ... . I will give it a look. I think there are 2 vCPUs. ( not on the same computer at the present time ). Anyway I find it strange. I would accept that 1 CPU runs a sched_fifo RT thread, 1 other CPU runs a sched_rr RT thread and the other CPUs normal sched other non-RT threads. I would accept that 2 sched_fifo RT threads with same priority would be executed simultaneously on 2 CPUs/Cores. But I find it a little bit strange that sched_fifo threads with different priorities are executed simultaneously. It seems contradictory. – NGI Sep 26 '20 at 07:19
Not executing them simultaneously would be against the defined behavior. If there are SCHED_FIFO 1, SCHED_FIFO 2, and SCHED_OTHER threads, and the scheduler runs the SCHED_FIFO 2 and SCHED_OTHER on two CPUs, then it is violating the requirement to run SCHED_FIFO 1 before SCHED_OTHER. – TrentP Sep 26 '20 at 09:43
@TrentP. Yes sorry you are right. Every SCHED_FIFO RT thread with any priority 1..99 must complete or get blocked before any other thread with policy SCHED_OTHER can run. But it does not change my question. Can a SCHED_FIFO 1 thread run on another CPU while a SCHED_FIFO 2 is already running on a CPU ? If we took care to give a higher priority to a thread that's because we want it to execute before the others. I will expand my search on SO on this topic. – NGI Sep 26 '20 at 11:20
The kernel will try to run as many threads as it can. There is no rule that only one SCHED_FIFO thread at a time can run. Also consider that only *runnable* SCHED_FIFO threads can or must be run. If a thread is not runnable, because it is blocked on a page fault or system call, then it can't run. If a SCHED_FIFO 1 thread can't run then another thread will run. – TrentP Nov 04 '20 at 07:52

NGI · Answer 2 · 2020-09-30T06:54:57.483

I tried many solutions but never got 'No defect' code. See also my other answer in this post

The code with the best success rate, though not perfect, is the one below, which uses the traditional C pthread API, allowing the threads to be created with the right attributes right from the start.

I am still astonished to see that I still get errors even with this code (same as the question's MCVE but with the pure pthread API).

In order to stress the code, I used the following sequence

$ seq 1000 | parallel ./main | grep inf
Result: inf
Result: inf
....

inf denotes the wrong division-by-0 result. The defect rate is in my case around 10/1000.

Commands like for i in {1..1000}; do ./main ; done | grep inf take longer

Threads are launched from higher priority to lower priority

So now the divisor thread

is created first
with higher RT priority (2 > 1 > main, which stays with SCHED_OTHER non-RT scheduling).

So I wonder why I still get division by 0 ...

Finally I tried to restrict the CPU affinity with taskset. It runs OK with a single CPU

$ taskset -pc 0 $$
pid 2414's current affinity list: 0,1
pid 2414's new affinity list: 0
$ for i in {1..1000}; do ./main_oss ; done   <<-- no need for parallel in this case
Result: 0.333333
Result: 0.333333
Result: 0.333333
Result: 0.333333
Result: 0.333333
...

but once there is more than 1 CPU, the defect comes back

$ taskset -pc 0,1 $$
pid 2414's current affinity list: 0
pid 2414's new affinity list: 0,1
$ seq 1000 | parallel ./main_oss
Result: 0.333333          | <<-- display by group of 2
Result: 0.333333          |
Result: inf             |   <<--
Result: 0.333333        |
...

Why is a lower-priority RT SCHED_FIFO thread run on another CPU when the threads belong to the same parent process?

Unfortunately PTHREAD_SCOPE_PROCESS is not supported on Linux

#include <chrono>
#include <cstring>
#include <iostream>
#include <thread>
#include <pthread.h>

double a = 1.0F;
double b = 0.0F;

void * ratio(void*)
{
    std::cout << "Result: " << a/b << "\n" << std::flush;
    return nullptr;
}

void * divisor(void*)
{
    b = 3.0F;
    std::this_thread::sleep_for(std::chrono::milliseconds(500u));
    return nullptr;
}


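int main(int argc, char * argv[])
{
    struct sched_param param;

    pthread_t thr[2];
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setschedpolicy(&attr,SCHED_FIFO);
    pthread_attr_setinheritsched(&attr,PTHREAD_EXPLICIT_SCHED);

    // divisor is created first, with the higher priority
    param.sched_priority = 2;
    pthread_attr_setschedparam(&attr,&param);
    int ret = pthread_create(&thr[0],&attr,divisor,nullptr);
    if ( 0 != ret )
    {
        // pthread functions return the error code (e.g. EPERM without root or setcap)
        std::cerr << "Problem with pthread_create: " << std::strerror(ret) << '(' << ret << ')' << "\n";
        return 1;
    }

    param.sched_priority = 1;
    pthread_attr_setschedparam(&attr,&param);
    ret = pthread_create(&thr[1],&attr,ratio,nullptr);
    if ( 0 != ret )
    {
        std::cerr << "Problem with pthread_create: " << std::strerror(ret) << '(' << ret << ')' << "\n";
        pthread_join(thr[0],nullptr);
        return 1;
    }
    pthread_attr_destroy(&attr);

    pthread_join(thr[0],nullptr);
    pthread_join(thr[1],nullptr);

    return 0;
}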

Would you consider your problem solved or not? I was going to take a look at it. I also wanted to mention that under Linux, threads are indistinguishable from processes w.r.t to the Kernel, so it could well schedule them onto another CPU — Micrified, Sep 29 '20 at 21:18
@Micrified, I still had problems with debugging ( the ability to set breakpoints when using exec-wrapper, the ability to use an IDE ... ). I will add a new answer to gather this data — NGI, Sep 30 '20 at 06:01
Okay. Well I've not tried using with a debugger, but what kind of outcome are you trying to get practically? I know this is a MCVE but is there some kind of real-world application/scenario you're having trouble with? — Micrified, Sep 30 '20 at 06:58
@Micrified. I added a second ["answer"](https://stackoverflow.com/a/64132315/3972710). Yes it has real-world usage when you want to simulate single-core embedded code on PC (the other questions on similar subject always mention this kind of case) — NGI, Sep 30 '20 at 07:03

NGI · Answer 3 · 2020-09-30T09:00:19.213

A new answer to gather the remaining problems I had with debugging.

Answers like "Setting application affinity in gdb" (Markus Ahlberg) or questions like "gdb don't break when I use exec-wrapper script to exec my target binary" gave a solution using the GDB option exec-wrapper, but then I was not (always) able to set breakpoints in my code (even when trying my own wrapper).

I finally came back to this solution again: "Setting application affinity in gdb" (Craig Scratchley).

The initial problem

$ ./main
Result: inf

The solution for run-time

taskset -c 0 ./main
Result: 0.333333

But for debug

gdb -ex 'set exec-wrapper taskset -c 0' ./main
--> mixed result depending on conditions (native/virtualized ? Number of cores ? ) 
sometimes 0.333333 sometimes inf
--> problem to set breakpoints
--> still work to do for me to summarize this issue

or

taskset -c 0 gdb main
...
(gdb) r
...
Result: inf

and finally

taskset -c N chrt 99 gdb main <<-- where N is a core number (*)
...                           <<-- 99 denotes here "your higher prio in your system"
(gdb) r
...
Result: 0.333333

I wrote N above because if your program main sets its affinity to processor M and you set the gdb affinity to N, you may run into the same original problem.
I wrote only chrt 99 for GDB, even though I am interested in SCHED_FIFO and not SCHED_RR, because I experienced gdb ( or IDE, see below ) freezes if option -f ( for FIFO ) was used. I suspect the round-robin mechanism is safer, as a thread will always release the CPU at some point.

And if you have an IDE (but do not know how to set up gdb properly inside this IDE), I was able to do

taskset -c N chrt 99 code
