
Just to head off any comments to the effect of "why do you need to know this??": This is just a puzzle I was curious about, not something I need to do for any practical reason.

Given a typical POSIX system[1], how would you design an experiment to determine the scheduling quantum[2] of a CPU-bound process?

[1]: but NOT one that lets you query for this information through a syscall or /proc interface

[2]: "Scheduling quantum" is defined as the amount of time a process will run on the CPU without blocking or yielding before its scheduled time is over and the OS lets a different process run.

Brennan Vincent
  • Note that a higher-priority process will generally run as soon as it wants to (otherwise there would be no point to having a higher priority). Perhaps in your last paragraph you meant to say "... before its scheduled time is over and the OS lets another same-priority process run." – Jeremy Friesner May 29 '16 at 06:10
  • @JeremyFriesner I thought in typical systems processes would not be pre-empted until their time-slice expired. Am I mistaken? How does it work in, say, Linux or FreeBSD? Regardless, I have edited the question. – Brennan Vincent May 29 '16 at 06:30
  • Typically a process will keep running until either (a) its quantum is expired (and another process with the same priority is ready to run), or (b) another process with a higher priority becomes ready to run. (The alternative, in which a ready-to-run higher-priority process is not running because a lower process is running instead, is known as a priority inversion and it's something scheduler designers try to avoid) – Jeremy Friesner May 29 '16 at 14:05
  • @JeremyFriesner so on a 1-core system, if a high-priority CPU-bound thread spins forever, nothing else can ever run, unless the scheduler figures out it needs to lower the priority? This seems wrong, but I'm not an expert. – Brennan Vincent May 30 '16 at 05:01
  • 1
    In a system with static priorities, that is exactly what can happen. This is common e.g. in real-time systems, if a high-priority thread never gives up the CPU then you may have to reboot the computer in order to regain control of it. Of course having a buggy program render your computer uncontrollable isn't a very desirable behavior, which is why some non-real-time OS's (notably Unix/Linux) dynamically change a process's priority if it fails to voluntarily give up the CPU (e.g. by blocking) for a long time. (see: http://www.informit.com/articles/article.aspx?p=101760 ) – Jeremy Friesner May 30 '16 at 05:21

2 Answers


I'm not sure how accurate it would be, but this might work:

  1. Make sure your computer is idle (or as idle as you can make it)
  2. Spawn off 2N threads (where N is the number of cores in your computer). All of these threads should be set to run at the same priority as each other.
  3. Each of the threads should be running an infinite loop where it does nothing but repeatedly retrieve the current monotonically-increasing-wallclock time using a high-resolution timer (e.g. calling std::chrono::steady_clock::now() or similar).
  4. At each iteration of the loop, each thread should check the resulting time value for "sudden gaps", i.e. where the clock time jumps from (t) to (t+n) milliseconds, with n being greater than the usual delta value. Those gaps most likely indicate a time period when the thread got kicked off of the CPU so that another one of the threads could run.
  5. At some point, compute the average of the sizes of all of those gaps, and that is your estimate of the scheduler's quantum size.

Note that this assumes that your clock's resolution is finer than the scheduler's quantum size; if it's not (e.g. if you're trying to use a clock with 10 ms resolution to measure a 5 ms quantum), then measuring the quantum length would be difficult AFAICT.

Jeremy Friesner
  • This is probably close, but I think it could be thrown off in systems where std::chrono::steady_clock::now() requires a syscall (so context switches and scheduling would be happening every time through the loop, not just after the timeslice expires) – Brennan Vincent May 29 '16 at 06:37
  • A syscall does not cause a switch to another task. That would be an extremely expensive system design. – usr May 29 '16 at 22:29
  • It might require a context switch from user-mode to kernel-mode (and back), though.... http://stackoverflow.com/questions/9238326/system-call-and-context-switch – Jeremy Friesner May 30 '16 at 02:26
  • @usr it often does -- especially if the call can't be completed immediately (think `read` etc.) – Brennan Vincent May 30 '16 at 04:40
  • Clock calls are not blocking syscalls. – usr May 30 '16 at 09:57

I think you could get the answer through a statistical analysis of enough runs of the following system:

  • Run one thread per processor that clears a terminate flag, then runs a loop for a fixed number of iterations or until a terminate flag is set, whichever comes first. These threads record whether they terminated due to running all the iterations, or due to the terminate flag being set.

  • At the same time, run an additional thread that sets the terminate flag.

Do this for a variety of iteration counts in the loop.

If the loop completes within the thread time slice, it will complete all iterations. If it does not complete within the thread time slice, the termination thread will have a chance to interrupt one of the looping threads.

Now, the termination thread will sometimes be scheduled first, and there may also be other threads running that complicate the behavior, so you may need to run this a lot of times on multiprocessor systems and analyze the results statistically. You'll also need to account for things like thread startup time and memory access time, since there will presumably be a memory barrier on each iteration through the loop to check the flag.

With enough repetitions at enough different loop iteration limits, though, this should give you the number of times you can iterate through the loop in one time slice. You can then run a large number of iterations on an unloaded system to get the length of time it takes for each iteration, and then calculate the wall clock time for each time slice.

Warren Dew
  • On a multiprocessor system, won't thread 2 /always/ run immediately on an available processor? – Brennan Vincent May 29 '16 at 20:29
  • @BrennanVincent Good point - if there are no other threads running, yes. I've adjusted the answer to accommodate that. Obviously the test design could be further improved, but I think the basic idea I'm providing is sound. – Warren Dew May 29 '16 at 21:53