
Note that this question is not a duplicate of "Python Subprocess.Popen from a thread", because that question did not ask for an explanation of why it is OK.

If I understand correctly, subprocess.Popen() creates a new process by forking the current process and then calling execv to run the new program.

However, if the current process is multithreaded and we call subprocess.Popen() in one of the threads, won't it duplicate all the threads in the current process (because it calls the fork() syscall)? If that's the case, then even though the duplicated threads are wiped out by the execv syscall, there is a window of time in which they can do a bunch of nasty stuff.

A case in point is gtest_parallel.py, where the program creates a bunch of threads in execute_tasks(). In each thread, task_manager.run_task(task) calls task.run(), which in turn calls subprocess.Popen() to run a task. Is this OK?

The question applies to other fork-in-thread programs, not just Python.

Leedehai

1 Answer


After fork(), only the calling thread exists in the child process; the other threads are not duplicated. Most of the pitfalls of forking in a multithreaded program come from mutexes that other threads hold at the moment of the fork: the child inherits them in a locked state, and they will never be released there. When you're using Popen, the child is replaced by some unrelated program as soon as it calls execv, so that's not really a concern. There is a warning in the Popen docs about being careful with multiple threads and the preexec_fn parameter, which runs in the child before the execv call happens:

Warning The preexec_fn parameter is not safe to use in the presence of threads in your application. The child process could deadlock before exec is called. If you must use it, keep it trivial! Minimize the number of libraries you call into.
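One way to sidestep that warning is to avoid preexec_fn entirely. As a minimal sketch, assuming a POSIX system and Python 3: the subprocess docs suggest the start_new_session parameter in place of preexec_fn=os.setsid, and the example below launches children that way from several worker threads. The names CHILD_CMD, run_in_new_session and run_from_threads are made up for illustration, and the child command is only a placeholder that reports whether it became its own session leader.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Placeholder child command (an assumption for this sketch): prints "True"
# if the child is the leader of its own session.
CHILD_CMD = [sys.executable, "-c", "import os; print(os.getsid(0) == os.getpid())"]


def run_in_new_session(cmd):
    """Launch cmd in a new session without preexec_fn and return its stdout."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        text=True,
        start_new_session=True,  # used here instead of preexec_fn=os.setsid
    )
    out, _ = proc.communicate()
    return out.strip()


def run_from_threads(n_threads=4):
    """Call Popen concurrently from n_threads worker threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(run_in_new_session, [CHILD_CMD] * n_threads))


if __name__ == "__main__":
    print(run_from_threads())
```

No Python code of ours runs between fork and exec here, which is the situation the warning is concerned with.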

I'm not aware of any other pitfalls to watch out for with Popen, at least in recent versions of Python. Python 2.7's subprocess module does seem to have flaws that can cause issues with multi-threaded applications, however.
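The claim that only the calling thread survives fork() can be checked with a small experiment. This is a rough sketch, assuming a POSIX system: it calls os.fork() directly from a worker thread (rather than going through Popen) and has the child report threading.active_count() back over a pipe. The function names are made up for illustration, and active_count() reflects Python's own bookkeeping of threads rather than a kernel-level count. Recent Python versions may print a DeprecationWarning about forking in a multithreaded process; that does not affect the result.

```python
import os
import threading


def thread_count_seen_by_forked_child():
    """Fork from the calling thread and return the child's Python thread count."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: report the thread count, then exit immediately
        # without running the parent's cleanup handlers.
        try:
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)
    # Parent process: read the child's report and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        report = reader.read()
    os.waitpid(pid, 0)
    return int(report)


def demo(extra_threads=3):
    """Return (thread count in parent before forking, thread count in child)."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(extra_threads)]
    for worker in workers:
        worker.start()
    parent_count = threading.active_count()

    result = {}

    def fork_and_record():
        result["child"] = thread_count_seen_by_forked_child()

    # Fork from a non-main thread, as a threaded task runner would.
    forker = threading.Thread(target=fork_and_record)
    forker.start()
    forker.join()

    # Let the idle background threads finish.
    stop.set()
    for worker in workers:
        worker.join()
    return parent_count, result["child"]


if __name__ == "__main__":
    print(demo())
```

The parent has several live threads when it forks, yet the child should report a single one: the replica of the thread that called fork().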

dano
  • Wow thank you! It leads me to the POSIX 2018 spec: "A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources." (https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html) May add this to your answer as well, for the benefit of future readers. – Leedehai Jun 12 '20 at 19:23