
I'm running some subprocesses from Python in parallel, and I want to wait until every subprocess has finished. My current solution is inelegant:

import subprocess

runcodes = ["script1.C", "script2.C"]
ps = []
for script in runcodes:
  args = ["root", "-l", "-q", script]
  p = subprocess.Popen(args)
  ps.append(p)
while True:
  ps_status = [p.poll() for p in ps]
  if all(x is not None for x in ps_status):
    break

Is there a class that can handle multiple subprocesses? The problem is that the `wait` method blocks my program.

Update: I want to show progress during the computation, something like "4/7 subprocesses finished...".

If you are curious: `root` compiles the C++ script and executes it.

Ruggero Turra

5 Answers


If your platform is not Windows, you could probably select against the stdout pipes of your subprocesses. Your app will then block until either:

  • One of the registered file descriptors has an I/O event (in this case, we're interested in a hangup on the subprocess's stdout pipe)
  • The poll times out

Non-fleshed-out example using epoll with Linux 2.6.xx:

import select
import subprocess

poller = select.epoll()
subprocs = {}  # map each stdout pipe's file descriptor to its Popen object

# spawn some processes
for i in range(5):
    subproc = subprocess.Popen(["mylongrunningproc"], stdout=subprocess.PIPE)
    subprocs[subproc.stdout.fileno()] = subproc
    poller.register(subproc.stdout, select.EPOLLHUP)

# poll until all processes are done
while subprocs:
    for fd, flags in poller.poll(timeout=1):  # never more than a second without a UI update
        done_proc = subprocs.pop(fd)
        poller.unregister(fd)
        print("this proc is done! blah blah blah")
        ...  # do whatever
    # print a reassuring spinning progress widget
    ...
Jeremy Brown
  • This is neat! Is there a way to get `subproc.stdout` to print to the terminal while `mylongrunningproc` is running? – unutbu Jul 07 '10 at 14:54
  • 1
    The first thing that comes to mind is to also register the stdout pipes for input events - `poller.register(subproc.stdout, select.EPOLLHUP | select.EPOLLIN)`. Then you can do `if flags & select.EPOLLIN: print done_proc.stdout.readline()`. You would have to be careful about blocking indefinitely in case if the output is not line-delmited though. In linux I _think_ you can get around that by using `fcntl` to set the stdout pipe to be non-blocking and then catching IOError with errno=EAGAIN. Ex - `fcntl.fcntl(subproc.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)` – Jeremy Brown Jul 07 '10 at 15:49
  • reference link on non-blocking reads of the pipe - (http://www.gossamer-threads.com/lists/python/dev/658205) – Jeremy Brown Jul 07 '10 at 15:52
  • If the solution is based on the select syscall, maybe a better, higher-level abstraction is asyncio, which is ultimately based on it. I will create another answer for that. – turbopapero Feb 04 '22 at 23:26
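The non-blocking read trick mentioned in the comments above can be sketched like this (a sketch, not part of the original answer; `sh -c` with a `sleep` stands in for a real long-running process, and this is Unix-only since it uses `fcntl`):

```python
import errno
import fcntl
import os
import subprocess

# "sh -c" with a sleep stands in for a real long-running process
proc = subprocess.Popen(["sh", "-c", "sleep 0.5; echo finished"],
                        stdout=subprocess.PIPE)

# put the stdout pipe into non-blocking mode, as suggested in the comments
fd = proc.stdout.fileno()
flags = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

# reading before any output is available no longer blocks;
# it fails immediately with EAGAIN instead
try:
    os.read(fd, 1024)
    blocked_read_failed = False
except OSError as exc:
    assert exc.errno == errno.EAGAIN
    blocked_read_failed = True

proc.wait()               # let the process finish...
data = os.read(fd, 1024)  # ...now the output is there
```

This lets a UI loop attempt reads opportunistically without ever hanging on a quiet pipe.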

How about

import os, subprocess
runcodes = ["script1.C", "script2.C"]
ps = {}
for script in runcodes:
    args = ["root", "-l", "-q", script]
    p = subprocess.Popen(args)
    ps[p.pid] = p
print("Waiting for %d processes..." % len(ps))
while ps:
    pid, status = os.wait()
    if pid in ps:
        del ps[pid]
        print("Waiting for %d processes..." % len(ps))

Note that `os.wait` is available only on Unix.
Marius Gedminas

You could do something like this:

import subprocess

runcodes = ["script1.C", "script2.C"]

ps = []
for script in runcodes:
    args = ["root", "-l", "-q", script]
    p = subprocess.Popen(args)
    ps.append(p)

for p in ps:
    p.wait()

The processes will run in parallel, and you'll wait for all of them at the end.
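If you also want the "4/7 finished" message from the update, the wait-at-the-end loop can be replaced by a poll-and-sleep loop (a sketch; the `sh -c "sleep ..."` commands are hypothetical stand-ins for the real `root -l -q script.C` invocations):

```python
import subprocess
import time

# hypothetical stand-ins for the real "root -l -q script.C" invocations
cmds = [["sh", "-c", "sleep 0.1"], ["sh", "-c", "sleep 0.3"]]

ps = [subprocess.Popen(cmd) for cmd in cmds]
total = len(ps)
while any(p.poll() is None for p in ps):
    done = sum(p.poll() is not None for p in ps)
    print("%d/%d subprocesses finished..." % (done, total))
    time.sleep(0.05)  # sleep between polls instead of busy-waiting
print("%d/%d subprocesses finished..." % (total, total))
```

The `time.sleep` keeps the loop from spinning at 100% CPU while still updating the progress line a few times per second.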

nosklo
  • yes, the problem is that I can't write `# process finished` during the execution: suppose the first subprocess is very slow; `p` stays equal to the first element of `ps`, and Python is frozen waiting for it, so it can't report that all the other subprocesses have already finished. – Ruggero Turra Jul 07 '10 at 11:38
  • What if I'd like to wait for the first process to finish and then kill the others? – andreykyz May 13 '20 at 12:51

This answer is related to the select-based answer above and uses a similar mechanism underneath (ultimately built on the same family of syscalls): asyncio. You can read more about asyncio in the official Python documentation.

Asyncio is a good fit when your program is IO-bound, and yours seems to be, at least in this section: it spends most of its time waiting for the external scripts to complete and only prints a message when each one ends.

The following code should work for you (perhaps with some minor adjustments):

import asyncio

# The scripts you want to run concurrently
runcodes = ["script1.C", "script2.C"]

# An awaitable coroutine that calls your script
# and waits (non-blocking) until the script is done 
# to print a message
async def run_script(script):
    # You will need to adjust the arguments of create_subprocess_exec here
    # according to your needs
    p = await asyncio.create_subprocess_exec(script)
    await p.wait()
    print("Script", script, "is done")

# You create concurrent tasks for each script
# they will start in parallel as soon as they are
# created
async def main():
    tasks = []
    for script in runcodes:
        tasks.append(asyncio.create_task(run_script(script)))

    # You wait until all the tasks are done before 
    # continuing your program
    for task in tasks:
        await task

if __name__ == "__main__":
    asyncio.run(main())

A more detailed explanation:

Asyncio allows you to execute concurrent tasks with a single thread, by alternating between the various asynchronous tasks and simply waiting when all of them are blocked.

The function `run_script` is asynchronous and calls your script using a mechanism similar to `subprocess.Popen`. The difference is that the returned object is awaitable, meaning you can jump to something else while the wait is in progress.

You can read more about subprocess management with asyncio in the official documentation. You will notice that it is very similar to "normal" Python subprocess management, and the arguments are similar to those of `Popen`.

Notice that this is different from threading; in fact, this program is single-threaded. Do not confuse asyncio with multi-threading: they are two different approaches to running tasks concurrently (each with pros and cons).

The `main` function creates one task for each script you want to run and waits on them.

The important thing is that this `await` neither blocks nor does live polling: it sleeps until one of the tasks is ready. Once a task is ready, execution returns to that task, which can then print your message.

The program will not exit the `main` function until all awaited tasks are done; it stays inside the event loop started by `asyncio.run` until everything has completed.
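As a usage note, the per-task await loop in `main` can also be written with `asyncio.gather`, which awaits all the tasks at once. The progress counter below is an addition for illustration (the `sleep` commands are stand-ins for the real scripts):

```python
import asyncio

async def run_one(cmd, state):
    # "sleep" stands in for the real script invocation; adjust as needed
    p = await asyncio.create_subprocess_exec(*cmd)
    await p.wait()
    state["done"] += 1
    print("%d/%d subprocesses finished..." % (state["done"], state["total"]))
    return p.returncode

async def main():
    cmds = [["sleep", "0.1"], ["sleep", "0.2"]]
    state = {"done": 0, "total": len(cmds)}
    # gather awaits all the coroutines concurrently and collects their results
    return await asyncio.gather(*(run_one(cmd, state) for cmd in cmds))

codes = asyncio.run(main())
```

`gather` returns the results in the order the coroutines were passed in, regardless of completion order.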

turbopapero

I think the answer lies not in Python code or language features but in the system's capabilities. Consider this solution (it relies on the shell's `&` operator and `wait` builtin, so it is Unix-specific):

import subprocess

runcodes = ["script1.C", "script2.C"]

cmds = []
for script in runcodes:
    cmds.append(" ".join(["root", "-l", "-q", script]))

# "&" puts each command in the background; the trailing "wait" makes the
# shell wait for all of them, so p.wait() returns only when every script is done
p = subprocess.Popen(" & ".join(cmds) + " & wait", shell=True)
p.wait()
Mhadhbi issam