4

My ultimate goal is to do some testing on multithreading programs on POSIX systems (not only Linux); and I hope to diversify the interleavings by intervening the schedules.


UPDATE: an example to explain my goal

For example, suppose the program will eventually run two threads, T1, T2. At the function level (or other levels such as instruction level, basic block level), suppose they have the following traces:

T1: A1 -> B1 -> C1
T2: A2 -> B2 -> C2

During the actual execution on one machine, there might be only a few interleavings. e.g., in an extreme case, no matter how many times the program execute, there is only one interleaving A1 -> B1 -> A2 -> B2 -> C2 -> C1 on one machine; but may be it's possible another interleaving A1 -> A2 -> B1 -> B2 -> C2 -> C1 happens on another machine, while this interleaving introduces a bug. Therefore I hope to diversify the interleaving by adding some scheduling controlling POSIX function calls (we have an automatic way to adjust the scheduling policy and priority dynamically), so that after running several versions of these schedule-adjusted programs it will at least provide more interleavings (but I don't care about the exact interleaving). For example, with one adjustment, it may execute trace A1 -> A2 -> B1 -> B2 -> C2 -> C1 with a higher chance; with another adjustment, it may execute trace A1 -> B1 -> A2 -> B2 -> C2 -> C1 or A1 -> B1 -> C1 -> A2 -> B2 -> C2 or other traces with a higher chance. In this sense, we increase the testing coverage.


Presently I'm thinking about instrumenting some code to adjust with different scheduling policies and priorities. And I came across SCHED_OTHER and SCHED_RR.

By default programs run with SCHED_OTHER. However it seems that POSIX standard hasn't detailed the requirements (see here although I'm not sure whether this is the latest standard). Specially, it even does not say whether it is round-robin. And by reading Linux sched man page and Red Hat documentation, it says the static priority is fixed 0 and the dynamic priority is

based on the nice value and is increased for each time quantum the thread is ready to run, but denied to run by the scheduler

To me, this seems to say setting nice value still cannot control the priority like what SCHED_RR does.
This other major issue is that setting nice value per thread may not be possible for other POSIX systems other than Linux

According to POSIX.1, the nice value is a per-process attribute; that is, the threads in a process should share a nice value. However, on Linux, the nice value is a per-thread attribute: different threads in the same process may have different nice values.

Therefore, I'm thinking whether it's possible to use SCHED_RR (SCHED_FIFO is out of my considerations since it is not time quantum based thus will greatly change the schedule) to simulate SCHED_OTHER's behavior, since most of the programs I would like to instrument use the default SCHED_OTHER policy. I have come up with some naive ideas:

  1. Since unprivileged SCHED_OTHER programs have nice value between 0 and 19, I can ensure the priority for the SCHED_RR instrumented programs to be within this range; if this still has changed the behaviors greatly, I can narrow down this range more, e.g., 0~5.
  2. I will make sure all the threads I would like to investigate to have the same SCHED_RR.
  3. I can make the time slice/quantum to be equal or similar to the value that is used by SCHED_OTHER (e.g., on Linux, by changing values inside /proc/sys/kernel/sched_rr_timeslice_ms, see this Q&A) (again, I don't know to set the time quantum for SCHED_OTHER).

So may I know:

  1. Is this idea reasonable for POSIX?
  2. Or if this is not general to POSIX, may I know whether it's correct for Linux only?
  3. Or if using SCHED_RR still changes behaviors greatly from the default SCHED_OTHER on Linux, what are possible ways to do more interventions on multithreading scheduling on Linux?

Other relevant links:

Hongxu Chen
  • 5,240
  • 2
  • 45
  • 85
  • At least in Linux? FIFO and RR are realtime, RR is pure preemptive round robin with fixed priority and a more priority task is always scheduled before any less prioritive task. OTHER is the traditional multi-level-feedback RR, means that task that have not been scheduled for a time have an increasing priority, this is a pure overall timesharing system. Now I am not sure to understand what you want to do, at least it seems very difficult to achieve (trying to enforce various interleaving). – Jean-Baptiste Yunès Dec 21 '18 at 14:28
  • Yes, at least for Linux. What I'm going to do is to intervene the scheduling for the threads so that I can *test* the multithreading program more effectively. Will update the question a bit. – Hongxu Chen Dec 21 '18 at 15:44
  • updated with an example to explain my purpose. – Hongxu Chen Dec 21 '18 at 16:10
  • Testing multithreaded programs is a real challenge. You may either run multiple times under different conditions (that's hat's is usually done), add some small random-timed code everywhere to slow down a little bit your critical code (that is also very often done), or you may try to change the scheduling (there exists a way to do it in realtime under linux, can't remember exactly something like Coccinnelle from Gilles Muller ?) – Jean-Baptiste Yunès Dec 24 '18 at 07:41
  • Thank you Jean! will follow your advice and do some searching – Hongxu Chen Dec 25 '18 at 08:14

0 Answers0