I would like to test how my application responds to functions that hold the GIL. Is there a convenient function that holds the GIL for a predictable (or even a significant) amount of time?

My ideal function would behave like `time.sleep`, except that, unlike `sleep`, it would hold the GIL for the whole duration.

MRocklin
  • Functions don't hold the GIL (and can't). The Python interpreter loop does. Why do you think you need this? – Martijn Pieters Aug 23 '16 at 14:11
  • Perhaps you are looking for a **C** callable that doesn't explicitly tell the interpreter to release the GIL while it runs? – Martijn Pieters Aug 23 '16 at 14:12
  • I think I need this because I want to test how my distributed system performs when asked to run functions that cause the interpreter loop to hold the GIL for long periods of time: http://distributed.readthedocs.io/en/latest/ – MRocklin Aug 23 '16 at 14:14
  • That package is built in Pure Python and uses multiprocessing. There is no need to worry about the GIL there, *it doesn't apply there*. – Martijn Pieters Aug 23 '16 at 14:16
  • At any rate, there are *no* Pure Python functions that hold the GIL for long periods of time, because they can't hold the GIL. The GIL is a C-level entity, that C extensions can release to let the interpreter thread execute some more Python bytecode. – Martijn Pieters Aug 23 '16 at 14:23
  • There are functions, accessible from Python, that do hold the GIL. These functions may themselves call C code that does not release the GIL. – MRocklin Aug 23 '16 at 15:22
  • No, they *don't hold the GIL*. They do not signal that the lock can be released, that's a *big difference*. And again, **you don't need to worry about the GIL**, because you are not using threading. – Martijn Pieters Aug 23 '16 at 15:23
  • I'm not sure where you're getting that from. I do use threads for computation, I do need to worry about the GIL, and for my purposes there isn't a huge difference. If someone calls a function, say `pandas.DataFrame.merge`, which currently causes the GIL to be held for a long time, then my I/O will cease for a while. I'm getting bad behavior because of this and I'd like to test it. – MRocklin Aug 23 '16 at 15:32
  • I'm curious whether there were ever any further developments here. Specifically, I'm confused as to how a `pandas.DataFrame.merge` would itself block the GIL. When a worker in a distributed setting needs consistent communication, should that opportunity be provided every n ticks? Given that NumPy operations occur outside the GIL, what specific aspect would be blocking the worker's communication? Unless the NumPy step constitutes a "long tick" (a single interpreter instruction) that itself causes a timeout (?). – kuanb Jul 07 '17 at 19:25

2 Answers

A simple, but hacky, way to hold the GIL is to use the re module with a known-to-be-slow match:

import re
re.match(r'(a?){30}a{30}', 'a'*30)

On my machine, this holds the GIL for 48 seconds with Python 2.7.14 (and takes almost as long on 3.6.3). However, this relies on implementation details, and may stop working if the re module gets improvements.
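One way to check how long the match actually stalls other threads is to run a ticking helper thread alongside it. The following is only a sketch: `make_slow_match` and `max_stall_during` are hypothetical helper names, and the pattern size `n` is scaled down so the match finishes quickly (the run time grows steeply with `n`, and the exact timing depends on the Python version and machine).

```python
import re
import threading
import time


def make_slow_match(n):
    """Return a zero-argument callable that runs the slow match for size n.

    The pattern is compiled up front so that only the C-level matching,
    which does not release the GIL, is timed.
    """
    pattern = re.compile(r'(a?){%d}a{%d}' % (n, n))
    text = 'a' * n
    return lambda: pattern.match(text)


def max_stall_during(blocking_call, tick=0.001):
    """Run blocking_call while a helper thread ticks every `tick` seconds.

    Returns (elapsed, longest_gap): the wall time of the call and the
    longest pause observed between two consecutive ticks of the helper.
    """
    stop = threading.Event()
    gaps = []

    def ticker():
        last = time.perf_counter()
        while not stop.is_set():
            time.sleep(tick)  # needs the GIL again to return
            now = time.perf_counter()
            gaps.append(now - last)
            last = now

    helper = threading.Thread(target=ticker)
    helper.start()
    time.sleep(0.05)  # let the helper record a few normal ticks first
    start = time.perf_counter()
    blocking_call()
    elapsed = time.perf_counter() - start
    stop.set()
    helper.join()
    return elapsed, max(gaps)


if __name__ == "__main__":
    elapsed, stall = max_stall_during(make_slow_match(22))
    print("match took %.3fs, helper thread stalled for up to %.3fs"
          % (elapsed, stall))
```

If the match holds the GIL as described, the longest gap should be close to the total match time, because the helper thread cannot resume until the match returns.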

A more direct approach would be to write a C module that just sleeps. Python C extensions don't automatically release the GIL (unlike, say, ctypes, which releases it by default around foreign calls). Follow the hellomodule example here, and replace the printf() with a call to sleep() (or the Windows equivalent). Once you build the module, you'll have a GIL-holding function you can use anywhere.

itsadok

You can use a C library's sleep function in "PyDLL" mode.

import ctypes
import ctypes.util

# Use libc in ctypes "PyDLL" mode, which prevents CPython from
# releasing the GIL during procedure calls.
_libc_name = ctypes.util.find_library("c")
if _libc_name is None:
    raise RuntimeError("Cannot find libc")
libc_py = ctypes.PyDLL(_libc_name)
...
libc_py.usleep(...)

(See https://gist.github.com/jonashaag/d455671003205120a864d3aa69536661 for details on how to pickle the reference, for example if using in a distributed computing environment.)
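To make the idea concrete, here is a minimal sketch of a `time.sleep`-like helper built on the PyDLL approach, plus a small check that other threads really are blocked. It assumes a POSIX system where `usleep` takes an `unsigned int`. `hold_gil` and `count_ticks_during` are hypothetical names introduced for illustration, and the fallback to `PyDLL(None)` follows the suggestion in the comment on this answer. Some platforms may reject `usleep` arguments of one second or more, so the demo stays below that.

```python
import ctypes
import ctypes.util
import threading
import time

# PyDLL mode: ctypes keeps the GIL while the foreign function runs.
# find_library may return None; PyDLL(None) then exposes the symbols
# already loaded into the process (assumption: POSIX, libc is loaded).
libc_py = ctypes.PyDLL(ctypes.util.find_library("c"))
libc_py.usleep.argtypes = [ctypes.c_uint]  # assumption: useconds_t is unsigned int
libc_py.usleep.restype = ctypes.c_int


def hold_gil(seconds):
    """Sleep for roughly `seconds` without releasing the GIL."""
    libc_py.usleep(int(seconds * 1_000_000))


def count_ticks_during(func, *args):
    """Count how often a background thread manages to run while func executes."""
    ticks = 0
    stop = threading.Event()

    def ticker():
        nonlocal ticks
        while not stop.is_set():
            ticks += 1
            time.sleep(0.001)

    helper = threading.Thread(target=ticker)
    helper.start()
    time.sleep(0.05)  # let the helper get going
    before = ticks
    func(*args)
    after = ticks
    stop.set()
    helper.join()
    return after - before


if __name__ == "__main__":
    print("ticks during time.sleep(0.3):", count_ticks_during(time.sleep, 0.3))
    print("ticks during hold_gil(0.3): ", count_ticks_during(hold_gil, 0.3))
```

With `time.sleep` the helper thread should tick many times, whereas with `hold_gil` it should barely advance, which is the behavior the question asks for.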

Jonas H.
  • libc functions can be accessed without a name using `libc_py = ctypes.PyDLL(None)` so you can skip the lookup step – minrk Aug 25 '21 at 07:28