Environment: embedded device with Linux kernel 2.6.18.

Requirements: 3 threads created from one process (let's say P1 created T1, T2, T3).

T1 is at Linux priority 99 (the highest), T2 is at Linux priority 50 (the middle), and T3 is at Linux priority 2 (the lowest). No nice value is set explicitly for any of the threads.

Both T1 and T3 increment a variable once per second, and T1 prints both variables once every 5 seconds. This runs smoothly. [Problematic place] When T2 enters an infinite loop "for(;;);", T1's count keeps increasing properly, but T3's count stops increasing entirely, meaning T3 never gets any time on the CPU.

All this time I thought the Linux CFS guaranteed that every priority gets its appropriate share (based on weight). But this shows that any thread that hogs the CPU without sleeping stops all lower-priority threads from running.

Please answer if you know why the CFS scheduler behaves this way and whether there is a way to correct it.

BaskarA
  • Which SCHED policy are you using? – Maquefel Jan 21 '16 at 06:50
  • Ours is a pre-compiled kernel from the provider, and most terminal commands do not work for us. I couldn't even see /proc//sched under the process tree. – BaskarA Jan 21 '16 at 08:56
  • Are you sure your code is not the source of the problem? Also ask your provider for the sources and config. – Maquefel Jan 21 '16 at 10:15
  • No, the source code is plain code as explained above. Maybe the issue is that my kernel isn't using CFS at all and RR is probably used instead. I'll try to get that config, but I'm very unlikely to get it through the reporting ladder! – BaskarA Jan 21 '16 at 11:23
  • I don't clearly understand what you have. According to [man sched(7)](http://man7.org/linux/man-pages/man7/sched.7.html), `For threads scheduled under one of the normal scheduling policies (SCHED_OTHER, SCHED_IDLE, SCHED_BATCH), sched_priority is not used in scheduling decisions (it must be specified as 0).` So either you have CFS, which is the replacement for SCHED_OTHER and so doesn't use priorities, or you have priorities and a real-time-like policy. In the latter case the behaviour you observed is fine. – Tsyvarev Jan 21 '16 at 13:48
  • To expand on what @Tsyvarev said, the behavior you are seeing is *expected* for `SCHED_RR` and `SCHED_FIFO` (real-time) scheduling policies. Highest priority wins. If your priority 99 thread never blocks, the lower priority threads will never run at all, nor will any other process (on this cpu). Even `SCHED_RR` (round-robin) is only round-robin among threads at the same priority. This is why @Maquefel asked about your scheduling policy. You don't get `SCHED_FIFO` / `SCHED_RR` unless you asked for it, and you have to be super-user to get it at all. – Gil Hamilton Jan 21 '16 at 15:15
  • BTW, according to https://en.wikipedia.org/wiki/Completely_Fair_Scheduler CFS was not merged until 2.6.23 so it's likely that you don't have it anyway. But the `SCHED_FIFO`/`SCHED_RR` behavior has been there since 2.4 days. – Gil Hamilton Jan 21 '16 at 15:17

1 Answer

The realtime scheduling classes always pre-empt any lower scheduling classes. That is, a thread with SCHED_RR, if it is ready to run, will always pre-empt a thread with SCHED_OTHER. These classes should only be used to perform (usually short) urgent tasks which are required to fulfil the needs of other threads, the needs of hardware (such as reading from a serial port or network card buffer), or for security purposes (like writing auditing or logging entries, or committing database transactions). For example, user mode device drivers may use these priorities, as they must finish their work in order for other threads to run.

Similarly within SCHED_RR a higher priority process will always run if it is ready, which explains what you are seeing.

The key is this: the setting is about priority access to the CPU; it is not about sharing access to the CPU. Higher priority wins, always. That is what priority means.

(To prevent pathological cases, realtime processes are limited by default to using 95% of the CPU time. Hitting that limit should never happen in a healthy system.)

If you simply want your threads to have a greater share of the general resources, you should use SCHED_OTHER and set a negative nice value using nice(2) or setpriority(2).

nice(2) is about sharing the CPU because it is nice to share.

Ben
  • According to this post, https://www.linkedin.com/pulse/20140629145049-21586023-understanding-linux-scheduling, even SCHED_RR will have a limit, after which SCHED_OTHER is allowed to run. That seems to go against your "higher priority wins, always" sentence. Can you help explain a bit? I'm a newb in Linux scheduling. – Ryuu Sep 25 '17 at 13:47
  • @ryuu It appears you are correct. Doesn't really change the answer much for this particular question though. – Ben Sep 25 '17 at 14:11
  • @Ryuu The maximum time for real-time is typically 95%, as mentioned above. The specific value is exposed in /proc/sys/kernel/sched_rt_runtime_us – user228505 Mar 14 '18 at 15:42