
I have a Python 3.7 project.

It uses a library that calls subprocess.Popen to run a shell script.

I am wondering: if I were to put the library calls in a separate thread, would I be able to do work in the main thread while waiting for the result from Popen in the other thread?

There is an answer here https://stackoverflow.com/a/33352871/202168 which says:

The way Python threads work with the GIL is with a simple counter. With every 100 byte codes executed the GIL is supposed to be released by the thread currently executing in order to give other threads a chance to execute code. This behavior is essentially broken in Python 2.7 because of the thread release/acquire mechanism. It has been fixed in Python 3.

Either way, that does not sound particularly hopeful for what I want to do. It sounds like, if the "library calls" thread has not hit the 100-bytecode trigger point when the call to Popen.wait is made, it probably will not switch to my other thread and the whole app will wait for the subprocess?

Maybe this info is wrong, however.

Here is another answer https://stackoverflow.com/a/16262657/202168 which says:

...the interpreter can always release the GIL; it will give it to some other thread after it has interpreted enough instructions, or automatically if it does some I/O. Note that since recent Python 3.x, the criteria is no longer based on the number of executed instructions, but on whether enough time has elapsed.

This sounds more hopeful, since presumably communicating with the subprocess would involve I/O and might therefore allow a context switch, so that my main thread can proceed in the meantime. (Or perhaps the time spent blocked in wait would by itself cause a context switch.)

I am aware of https://docs.python.org/3/library/asyncio-subprocess.html which explicitly solves this problem, but I am calling a 3rd-party library which just uses plain subprocess.Popen.

Can anyone confirm if the "subprocess calls in a separate thread" idea is likely to be useful to me, in Python 3.7 specifically?

Anentropic

1 Answer

I had time to make an experiment, so I will answer my own question...

I set up two files:

mainthread.py

#!/usr/bin/env python
import subprocess
import threading
import time


def run_busyproc():
    print(f'{time.time()} Starting busyprocess...')
    subprocess.run(["python", "busyprocess.py"])
    print(f'{time.time()} busyprocess done.')


if __name__ == "__main__":
    thread = threading.Thread(target=run_busyproc)
    print("Starting thread...")
    thread.start()
    while thread.is_alive():
        print(f"{time.time()} Main thread doing its thing...")
        time.sleep(0.5)
    print("Thread is done (?)")
    print("Exit main.")

and busyprocess.py:

#!/usr/bin/env python
from time import sleep


if __name__ == "__main__":
    for _ in range(100):
        print("Busy...")
        sleep(0.5)
    print("Done")

Running mainthread.py from the command line, I can see the context switch you would hope for. The main thread is able to do work while the other thread waits on the result of the subprocess:

Starting thread...
1555970578.20475 Main thread doing its thing...
1555970578.204679 Starting busyprocess...

Busy...
1555970578.710308 Main thread doing its thing...
Busy...
1555970579.2153869 Main thread doing its thing...
Busy...
1555970579.718168 Main thread doing its thing...
Busy...
1555970580.2231748 Main thread doing its thing...
Busy...
1555970580.726122 Main thread doing its thing...
Busy...
1555970628.009814 Main thread doing its thing...

Done
1555970628.512945 Main thread doing its thing...

1555970628.518155 busyprocess done.
Thread is done (?)
Exit main.

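Good news everybody, Python threading works :) The thread that is blocked in subprocess.run releases the GIL while it waits on the child process, so the main thread keeps running in the meantime.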
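If you also need the result of the library call back in the main thread, the same idea can be sketched with concurrent.futures. This is only a minimal sketch under assumptions: blocking_library_call is a hypothetical stand-in for the third-party function (not its real API), and the child process here is just a one-second sleep that prints a marker.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_library_call():
    """Hypothetical stand-in for a third-party call that uses subprocess internally."""
    # The child sleeps for one second, then prints a marker line.
    completed = subprocess.run(
        [sys.executable, "-c", "import time; time.sleep(1); print('child done')"],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text (spelling that works on 3.7)
    )
    return completed.stdout.strip()


def main_with_worker():
    """Run the blocking call in a worker thread and count main-thread iterations."""
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(blocking_library_call)
        # The main thread keeps looping while the worker is blocked on the child.
        while not future.done():
            ticks += 1
            time.sleep(0.1)
        result = future.result()
    return result, ticks


if __name__ == "__main__":
    result, ticks = main_with_worker()
    print(f"Worker returned {result!r}; main thread looped {ticks} times meanwhile.")
```

The tick counter only stands in for "real work" in the main thread. future.result() also re-raises any exception from the worker thread, which a bare threading.Thread would not do.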

Anentropic