40

This pattern comes up a lot but I can't find a straight answer.

A non-critical, unfriendly program might do:

while(True):
    # do some work

On other technologies and platforms, if you want to allow this program to run hot (use as many CPU cycles as possible) but be polite (allow other programs that are running hot to effectively slow it down), you'd frequently write:

while(True):
    #do some work
    time.sleep(0)

I've read conflicting information about whether the latter approach does what I'd hope in Python, running on a Linux box. Does it cause a context switch, resulting in the behavior I mentioned above?

EDIT: For what it's worth, we tried a little experiment on Apple OSX (we didn't have a Linux box handy). This box has 4 cores plus hyperthreading, so we spun up 8 programs with just a

while(True):
    i += 1

As expected, the Activity Monitor shows each of the 8 processes as consuming over 95% CPU (apparently with 4 cores and hyperthreading you get 800% total). We then spun up a ninth such program. Now all 9 run around 85%. Now kill the ninth guy and spin up a program with

while(True):
    i += 1
    time.sleep(0)

I was hoping that this process would use close to 0% and the other 8 would run 95%. But instead, all nine run around 85%. So on Apple OSX, sleep(0) appears to have no effect.

Matthew Lund

3 Answers

38

I'd never thought about this, so I wrote this script:

import time

while True:
    print("loop")
    time.sleep(0.5)

Just as a test. Running this with strace -o isacontextswitch.strace -s512 python test.py gives this output for the loop:

write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)                   = 5
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
write(1, "loop\n", 5)  

select() is a system call, so yes, you are switching into the kernel in order to perform this. Technically, entering kernel space does not by itself require a context switch to another process, but if other processes are runnable, a select() with a timeout tells the kernel that they can run until the timeout expires. Interestingly, the delay is implemented as a select() with a timeout and no file descriptors: the first argument, 0, is the number of descriptors to watch, so only the timeout matters. Because select() returns early when a signal arrives, Python can react to events such as Ctrl+C without having to wait for the timeout, which I think is quite neat.

I should note that the same applies to time.sleep(0), except that the timeout passed in is {0, 0}. Also, spin locking is not really ideal for anything but very short delays; multiprocessing and threads provide the ability to wait on event objects instead.
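As a rough illustration of waiting on an event object instead of spinning, here is a minimal sketch using threading.Event. The function name wait_for_signal and the timeout value are made up for the example, and it assumes a Python version where Event.wait() returns a boolean (2.7 or 3.x).

```python
import threading

def wait_for_signal(timeout=1.0):
    """Block on an Event instead of polling in a sleep(0) loop."""
    ready = threading.Event()
    results = []

    def worker():
        # wait() blocks without burning CPU until set() is called or the
        # timeout expires; it returns True if the event was set.
        results.append(ready.wait(timeout))

    t = threading.Thread(target=worker)
    t.start()
    ready.set()   # wake the waiting thread
    t.join()
    return results[0]

if __name__ == "__main__":
    print(wait_for_signal())
```

The waiting thread is simply not runnable until the event is set, so the scheduler has nothing to arbitrate.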

Edit: So I had a look to see exactly what Linux does. The implementation of do_select (fs/select.c) makes this check:

if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
    wait = NULL;
    timed_out = 1;
}

if (end_time && !timed_out)
    slack = select_estimate_accuracy(end_time);

In other words, if an end time is provided and both of its fields are zero (!0 is 1, which evaluates to true in C), then wait is set to NULL and the select is considered timed out. However, that doesn't mean the function returns to you immediately; it still loops over all the file descriptors you have and calls cond_resched, thereby potentially allowing another process to run. So what happens is entirely up to the scheduler: if your process has been hogging CPU time compared to other processes, chances are a context switch will take place. If not, the task you are in (the kernel do_select function) might just carry on until it completes.
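If you want to see what the scheduler actually does on your own box, one option is to compare the process's context-switch counters before and after a sleep(0) loop. This is only a sketch: the function name count_context_switches is made up, it relies on the Unix-only resource module, and the numbers will vary with the kernel, the Python version, and whatever else is running.

```python
import resource
import time

def count_context_switches(iterations=10000):
    """Return (voluntary, involuntary) context switches during a sleep(0) loop."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    for _ in range(iterations):
        time.sleep(0)
    after = resource.getrusage(resource.RUSAGE_SELF)
    # ru_nvcsw counts switches where the process gave up the CPU itself;
    # ru_nivcsw counts switches forced by the scheduler.
    return (after.ru_nvcsw - before.ru_nvcsw,
            after.ru_nivcsw - before.ru_nivcsw)

if __name__ == "__main__":
    voluntary, involuntary = count_context_switches()
    print("voluntary: %d, involuntary: %d" % (voluntary, involuntary))
```

Running it once on an idle machine and once alongside a few busy loops should give a feel for whether sleep(0) yields in practice.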

I would reiterate, however, that the best way to be nicer to other processes generally involves using mechanisms other than a spin lock.

  • That's helpful. I want to make sure I understand your conclusion. The time.sleep(0) is causing a context switch - right? But I think maybe my assumption was wrong that this would cause my program to be more friendly to other processes? – Matthew Lund Sep 01 '11 at 18:03
  • @Matthew there are two things: a context switch (switching task, as in switching to another process) and switching to kernel mode. Just switching to kernel mode doesn't necessarily mean you'll also give another process CPU time. Adding a delay probably will (if there's something else running). This (sleep(0)) will definitely get you into kernel mode; it depends on the kernel in question whether asking for no delay instantly wakes your program up again, or whether it looks for other processes that are also waiting for CPU time with expired timeouts on file descriptors. –  Sep 01 '11 at 18:09
  • Actually the select is not on stdin. The first argument to select() is the number of file descriptors in the three sets, in this case 0. This is a trick for implementing sleep times with microsecond resolution. – Robert Larsen Feb 04 '13 at 12:16
  • 1
    Does anybody know whether the effect of `sleep(0)` is similar on Windows? – pylipp Feb 21 '19 at 12:48
18

I think you already have the answer from @Ninefingers, but in this answer we will try to dive into the Python source code.

First, the Python time module is implemented in C; to see the implementation of the time.sleep function, take a look at Modules/timemodule.c. As you can see (and without getting into all the platform-specific details), this function delegates the call to the floatsleep function.

Now, floatsleep is designed to work on different platforms, with the behavior kept as similar as possible across them. Since we are only interested in Unix-like platforms, let's check that part only:

...
Py_BEGIN_ALLOW_THREADS
sleep((int)secs);
Py_END_ALLOW_THREADS

As you can see, floatsleep calls the C sleep function. From the sleep man page:

The sleep() function shall cause the calling thread to be suspended from execution until either the number of realtime seconds specified by the argument seconds has elapsed or ...

But wait a minute, didn't we forget about the GIL?

Well, this is where the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros come into play (check Include/ceval.h if you are interested in the definition of these two macros). With these two macros, the C code above translates to the following steps:

Save the thread state in a local variable.
Release the global interpreter lock.
... Do some blocking I/O operation ... (call sleep in our case)
Reacquire the global interpreter lock.
Restore the thread state from the local variable.
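The effect of these steps can be observed from Python itself. The sketch below (the function name count_while_other_thread_sleeps is made up for the example) lets one thread sit in time.sleep while the main thread keeps counting. If sleep held the GIL, the main thread could not make progress and the count would stay near zero.

```python
import threading
import time

def count_while_other_thread_sleeps(sleep_seconds=0.2):
    """Count loop iterations in the main thread while another thread sleeps."""
    sleeper = threading.Thread(target=time.sleep, args=(sleep_seconds,))
    sleeper.start()
    count = 0
    # This loop only advances if the sleeping thread has released the GIL.
    while sleeper.is_alive():
        count += 1
    sleeper.join()
    return count

if __name__ == "__main__":
    print(count_while_other_thread_sleeps())
```

The exact count depends on the machine; the point is only that it is well above zero.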

More information about these two macros can be found in the C API documentation.

Hope this was helpful.

mouad
10

You are basically attempting to usurp the job of the OS CPU scheduler. It would likely be much better to simply call os.nice(100) to inform the scheduler that you're very low priority so it can do its job properly.
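A minimal sketch of that approach follows. The helper name lower_priority is made up; os.nice itself is a real, Unix-only call that adds the increment to the current niceness and returns the new value.

```python
import os

def lower_priority(increment=19):
    """Raise this process's niceness and return the new value (Unix only)."""
    # The kernel clamps the result to its allowed range, so a large
    # increment simply lands on the lowest priority (19 on Linux).
    return os.nice(increment)

if __name__ == "__main__":
    print("niceness is now %d" % lower_priority())
    while True:
        pass  # do some work; the scheduler now favors other processes
```

Note that an unprivileged process generally cannot lower its niceness again afterwards, so this suits programs that should stay low priority for their whole lifetime.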

Omnifarious
  • "A niceness of -20 is the highest priority and 19 is the lowest priority." https://en.wikipedia.org/wiki/Nice_(Unix) `man 2 nice`: "The range of the nice value is +19 (low priority) to -20 (high priority). Attempts to set a nice value outside the range are clamped to the range." Also, nice is only applicable to UNIX-like systems (not a general OS): https://docs.python.org/2/library/os.html#os.nice – Yaroslav Nikitenko Jun 13 '19 at 09:06
  • You are right. But if I want to change the niceness just locally? A part of my program is blocking on IO, and I want other parts to run as normal. The problem with nice is that after increasing niceness one can't decrease it again without superuser rights or ulimit. `man renice`: "an unprivileged user can only increase the ``nice value'' (i.e., choose a lower priority) and such changes are irreversible unless (since Linux 2.6.12) the user has a suitable ``nice'' resource limit (see ulimit(1) and getrlimit(2))." – Yaroslav Nikitenko Jun 13 '19 at 09:28
  • 1
    @YaroslavNikitenko - If it's blocking on IO, it's already not running, which is the lowest possible priority. :-) – Omnifarious Jun 13 '19 at 15:19