Forcing a thread to block all other threads from executing

Question

UPDATE:

This answer states that what I'm trying to do is impossible as of April 2013. This, however, seems to contradict what Alex Martelli says in Python Cookbook (p. 624, 3rd ed.):

Upon return, PyGILState_Ensure() always guarantees that the calling thread has exclusive access to the Python interpreter. This is true even if the calling C code is running a different thread that is unknown to the interpreter.

The docs also seem to suggest GIL can be acquired, which would give me hope (except I don't think I can call PyGILState_Ensure() from pure python code, and if I create a C extension to call it, I'm not sure how to embed my memory_daemon() in that).

Perhaps I'm misreading either the answer or Python Cookbook and the docs.

ORIGINAL QUESTION:

I want a given thread (from threading module) to prevent any other thread from running while a certain segment of its code is executing. What's the easiest way to achieve it?

Obviously, it would be great to minimize code changes in the other threads, to avoid using C and direct OS calls, and to make it cross-platform for windows and linux. But realistically, I'll be happy to just have any solution whatsoever for my actual environment (see below).

Environment:

CPython
python 3.4 (but can upgrade to 3.5 if it helps)
Ubuntu 14.04

Use case:

For debugging purposes, I calculate memory used by all the objects (as reported by gc.get_objects()), and print some summary report to sys.stderr. I do this in a separate thread, because I want this summary delivered asynchronously from other threads; I put time.sleep(10) at the end of the while True loop that does the actual memory usage calculation. However, the memory reporting thread takes a while to complete each report, and I don't want all the other threads to move ahead before the memory calculation is finished (otherwise, the memory snapshot will be really hard to interpret).

Example (to clarify the question):

import threading as th
import time

def report_memory_consumption():
  # go through `gc.get_objects()`, check their size and print a summary
  # takes ~5 min to run

def memory_daemon():
  while True:
    # all other threads should not do anything until this call is complete
    report_memory_consumption()
    # sleep for 10 sec, then update memory summary
    # this sleep is the only time when other threads should be executed
    time.sleep(10)


def f1():
  # do something, including calling many other functions
  # takes ~3 min to run

def f2():
  # do something, including calling many other functions
  # takes ~3 min to run


def main():
  t_mem = th.Thread(target = memory_daemon)
  t1 = th.Thread(target = f1)
  t2 = th.Thread(target = f2)
  t_mem.start()
  t1.start()
  t2.start()

# requirement: no other thread is running while t_mem is not sleeping

I believe, Python can execute only _one thread at a time_ because of the GIL. — ForceBru, Mar 28 '15 at 12:28

Saikiran Yerram · Answer 1 · 2015-03-29T02:04:31.050

3

You should use threading locks to execute code synchronously between threads. The answer given is somewhat correct but I would use reentrant locals to check again to see if you indeed have the lock.

Do not use variables as described in another answer to check for lock possession. The variables can get corrupted between multiple threads. Reentrant locks were meant to solve this problem.

Also what's incorrect in that code is that lock is released assuming the code between doesn't throw exception. so always do in with context or try-catch-finally.

Here is an excellent article explaining synchronization in Python and threading docs.

Edit: Answering OP's update on embedding Python in C

You misunderstood what he said in the cookbook. PyGILState_Ensure returns the GIL if a GIL is available in the current python interpreter but not C threads which is unknown to the python interpreter.

You can't force to get GIL from other threads in the current interpreter. Imagine if you were able to, then basically you will cannibalize all other threads.

edited Mar 29 '15 at 02:04

answered Mar 28 '15 at 15:51

Saikiran Yerram

2,994
3
17
20

See my edit to clarify the question. A simple `Lock` or `RLock` won't do the trick, since the other thread needs to be stopped whenever the "controlling thread" stops sleeping, regardless of where the "instruction pointer" happens to be in that other thread. (And of course, I can't insert a `Lock` check at every line of code in the other thread.) – max Mar 28 '15 at 23:12
Gotcha. So what begs the question is whether the execution should suspend immediately or should it finish its unit of execution before checking if it should suspend or run. If it is the latter, it's easier by simply looping `while should_run:` and then put code inside that loop. You can then update that flag in a listener that listens to messages from main thread on whether you should continue or suspend. – Saikiran Yerram Mar 28 '15 at 23:36
Yes, the `while should_run` construct would work for code that yields to being represented as a loop; unfortunately, my code is just a long sequence of operations. I would have to essentially sprinkle checks of `should_run` throughout the code, making this a rather cumbersome task, and creating a maintenance nightmare. – max Mar 28 '15 at 23:52

Albert · Accepted Answer · 2015-03-31T12:45:39.253

The Python Cookbook is correct. You have exclusive access to the Python interpreter at the point when PyGILState_Ensure() returns. Exclusive access means that you can safely call all CPython functions. And it means the current C thread is also the current active Python thread. If the current C thread did not have a corresponding Python thread before, PyGILState_Ensure() will have created one for you automatically.

That is the state right after PyGILState_Ensure(). And you also have the GIL acquired at that point.

However, when you call other CPython functions now, such as PyEval_EvalCode() or any other, they can implicitly make that the GIL gets released meanwhile. For example, that is the case if implicitly the Python statement time.sleep(0.1) gets called somewhere as a result. And while the GIL is released from this thread, other Python threads can run.

You only have the guarantee that when PyEval_EvalCode() (or whatever other CPython function you called) returns, you will again have the same state as before - i.e. you are on the same active Python thread and you again have the GIL.

About your original question: There currently is no way to achieve this, i.e. to call Python code and avoid that the GIL gets released as a result somewhere meanwhile. And this is a good thing, otherwise you could easily be end up in deadlocks, e.g. if you don't allow some other thread to release some lock which it currently holds.

About how to implement your use case: The only real way to do that is in C. You would call PyGILState_Ensure() to get the GIL. And at that point, you must only call those CPython functions which cannot have the side effect of calling other Python code. Be very careful. Even PyObj_DecRef() could call __del__. The best thing would be to avoid calling any CPython functions and manually traversing the CPython objects. Note that you probably don't have to do it as complicated as you outlined it: There is the underlying CPython memory allocator and I think you can just get the information from there.

Read here about the memory management in CPython.

Related code is in pymem.h, obmalloc.c and pyarena.c. See the function _PyObject_DebugMallocStats(), although that might not be compiled into your CPython.

There is also the tracemalloc module which however will add some overhead. Maybe its underlying C code (file _tracemalloc.c) is helpful however to understand the internals a bit better.

About sys.setswitchinterval(1000): That is related only for going through the Python bytecode and handling it. That is basically the main loop of CPython in PyEval_EvalFrameEx in the file ceval.c. There you'll find such part:

if (_Py_atomic_load_relaxed(&gil_drop_request))
    ...

All the logic with the switch interval is covered in the file ceval_gil.h.

Setting a high switch interval just means that the main loop in PyEval_EvalFrameEx will not be interrupted for a longer time. That does not mean that there aren't other possibilities that the GIL could get released meanwhile and that another thread could run.

PyEval_EvalFrameEx will execute the Python bytecode. Let's assume that this calls time.sleep(1). That will call the native C implementation of the function. You'll find that in time_sleep() in the file timemodule.c. If you follow that code, you'll find this:

Py_BEGIN_ALLOW_THREADS
err = select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &timeout);
Py_END_ALLOW_THREADS

Thus, the GIL gets released meanwhile. Now, any other thread which is waiting for the GIL could pick it up and run other Python code.

Theoretically, you could think, if you set a high switch interval and never call any Python code which in turn could release the GIL at some point, you would be safe. Note that this is almost impossible, though. E.g. the GC will get called from time to time and any __del__ of some objects could have various side effects.

Thanks this is very helpful. Can you comment on whether my answer works (with certain caveats)? — max, Mar 29 '15 at 22:59
Thanks. Is there any way to collect the data on how many thread switches happened while my program was running? — max, Mar 31 '15 at 09:36
@max: No. You would have to modify CPython. But that will be hard. Easier would be to implement the memory counting in C and just don't release the GIL while you do the calculation. I will extend my answer with some more information about that. — Albert, Mar 31 '15 at 11:48

score 1 · Answer 3 · edited May 23 '17 at 12:01

1

Python is always executing one thread at a time because of the Global Interpreter Lock. It doesn't do so when multiprocessing is involved. You can see this answer to learn more about the GIL in CPython.

Note, that's pseudocode as I don't know how you're creating threads/using them/which code you're executing in threads.

import threading, time

l=threading.Lock()
locked=False

def worker():
    l.acquire()
    locked=True
    #do something
    l.release()

def test():
    while locked:
        time.sleep(10)
    #do something

threads = []
t = threading.Thread(target=worker)
threads.append(t)
t = threading.Thread(target=test)
threads.append(t)
for th in threads:
    th.start()
for th in threads:
    th.join()

Certainly, it may be written better and can be optimized.

edited May 23 '17 at 12:01

Community

1
1

answered Mar 28 '15 at 12:34

ForceBru

43,482
10
63
98

4

This may be the case, but it doesn't prevent Python from releasing the GIL and switching threads during the critical section. – user2357112 Mar 28 '15 at 12:36
So, we have to use some C or C++ to lock/release the GIL, that's not what we can do with pure Python. – ForceBru Mar 28 '15 at 12:38
@ForceBru Hmm.. I asked for the easiest way to solve this, but if there's no way to do it in pure python, a solution in C is still better than no solution at all! :) – max Mar 28 '15 at 12:53
@max, you can use some [locks](https://docs.python.org/3/library/threading.html) to do this. For example, make some threads pause while a certain lock is locked and make them resume their work when it's released. – ForceBru Mar 28 '15 at 12:58
@ForceBru but how? I don't know where in the other threads the execution happens to be occurring when the interpreter chooses to switch to any of them, and I don't know how to check a lock in *every* line of code in a thread (I think it's impossible?) – max Mar 28 '15 at 13:01
First of all saying GIL prevents execution of threads is wrong. The correct statement is that only one thread can execute at a time. When a thread yields (time.sleep, lock.acquire, IO call), it releases GIL allowing other threads to execute. Also in the above code, use reentrant locals to check again to see if you indeed have the lock. Don't use variables to check if lock is available or not because it corrupt the value. – Saikiran Yerram Mar 28 '15 at 15:46
@SaikiranYerram, you may have misread the answer, as for the GIL, I'm saying _exactly the same thing_ you mentioned! As for the code, any improvements are welcome: it was mentioned that this code can be written better. – ForceBru Mar 28 '15 at 15:52
I have added my answer. I would use Rentrant locks to check lock possession and acquire/release in context or `try-catch-finally` to ensure locks are always released. Not sure why `time.sleep` because Python will yield but won't execute any code since the thread is holding the lock. – Saikiran Yerram Mar 28 '15 at 15:59
Sorry, I wasn't very clear in my question. I edited it to clarify. Using your code won't work for me because `worker()` needs to execute in a loop with some sleep inserted in between, while `test()` should only execute while `worker()` is sleeping and should be suspended otherwise. It does seem that it would require writing a C extension, or worse. – max Mar 28 '15 at 23:10

score 1 · Answer 4 · answered Mar 29 '15 at 00:39

As a stop-gap solution (for obvious reasons), the following worked for me:

def report_memory_consumption():
  sys.setswitchinterval(1000) # longer than the expected run time
  # go through `gc.get_objects()`, check their size and print a summary
  # takes ~5 min to run
  sys.setswitchinterval(0.005) # the default value

If anyone has a better answer, please post it.

Forcing a thread to block all other threads from executing

4 Answers4

Linked