2

I have a C program that takes two args and outputs a number.

./a.out 2 3 (for example).

It does some computationally expensive operations, so I was wondering if I could use Python's multiprocessing library to run a bunch of instances of the C program and then collect all the numbers into a list or some other data structure?

Thanks.

This is not a duplicate because my question is how I can do it IN PARALLEL (e.g. MANY THREADS)!

I don't know how I can have one Python program running a few hundred C programs and capturing all of the output into a Python list. Using subprocess seems to be one-to-one: one call, one process.

Sam Mussmann
Eiyrioü von Kauyf

4 Answers

2

You can use a ThreadPool to run many tasks in parallel.

from multiprocessing.pool import ThreadPool
import subprocess

def f(x):
    # Run one instance of the C program and parse its numeric output.
    a, b = x
    res = subprocess.check_output(["./a.out", str(a), str(b)])
    return int(res.strip())

# By default the pool has one worker thread per CPU.
p = ThreadPool()
results = p.map(f, [(2, 3), (5, 6), (9, 10)])
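
If you want a different number of ./a.out processes in flight at once (for example, because each one also spends time on I/O), you can pass the pool size explicitly; the value here is only an illustration:

p = ThreadPool(8)  # run at most 8 external processes at the same time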
nneonneo
  • This seems to work, though I haven't tried it yet. Thanks! :) – Eiyrioü von Kauyf Oct 14 '12 at 00:33
  • Note that it will use a lot of memory (and mainly address space). If the number of external processes is large, or you are running on a 32-bit system, this might be an issue. – perh Oct 14 '12 at 18:41
  • @perh: I disagree. This creates only as many threads as you have CPUs, and therefore only as many simultaneous external processes as you have CPUs. So it won't use a lot of memory at all, unless `a.out` is an incredible memory hog. – nneonneo Oct 14 '12 at 18:46
  • True, I was considering the case where you are running a lot of external processes. The issue is that starting one thread per process basically means you are using two contexts per process instead of 1. – perh Oct 14 '12 at 20:11
  • Threads are pretty cheap, and they don't require a lot of memory to run (and `ThreadPool` reuses the threads, so it's not terribly wasteful either). I don't think it's a big deal. After all, it's not uncommon to have e.g. one thread per connection in a server. – nneonneo Oct 14 '12 at 20:13
1

You can use subprocess.Popen to run multiple processes at once without using threads.

If the output from them is short enough to fit in the operating system's pipe buffers, it is fairly easy:

To start a program asynchronously, use

subprocess.Popen(['command', args], stdout=subprocess.PIPE)

Just do that for all the commands and place the resulting Popen objects in a list.

Then:

for process in subprocesses:
    process.wait()
    stdout, stderr = process.communicate()

This will not work if a subprocess outputs a lot of data, because wait() will deadlock: the process wants to write more, but the buffer is full, and you are waiting for the process to finish before you read.

In that case you will need to look into select.poll() or similar APIs.
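
For example, a minimal sketch of the start-them-all-then-collect pattern (the argument pairs below are just placeholder inputs for the OP's ./a.out):

import subprocess

arg_pairs = [(2, 3), (5, 6), (9, 10)]

# Start every process first so they all run concurrently.
procs = [subprocess.Popen(["./a.out", str(a), str(b)], stdout=subprocess.PIPE)
         for a, b in arg_pairs]

# communicate() keeps reading each pipe until EOF before reaping the
# process, so a single large output will not deadlock; children whose
# pipes fill up simply pause until their turn is read.
results = [int(p.communicate()[0]) for p in procs]
print(results)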

perh
  • If he has a *few hundred* processes, this is going to create a large number of very busy processes which may not be ideal (especially if they all require large amounts of memory). Better to create the commands in batches, and wait on each batch. – nneonneo Oct 14 '12 at 18:49
0

You can try Python's subprocess module; it allows you to start a process, wait for it to finish, and capture all of its output (stdout, stderr).

You can find the subprocess docs here: http://docs.python.org/library/subprocess.html

You can look at my example:

#file t1.py

import time

def __main__():
    # Simulate a long-running computation, then print its result.
    time.sleep(10)
    print(10)

if __name__ == "__main__":
    __main__()

#file: t2.py

import subprocess

def __main__():
    N = 10
    V = 0

    # Start N copies of t1.py; each one writes its result to stdout.
    pp = [subprocess.Popen("python t1.py", stdout=subprocess.PIPE, shell=True)
          for _ in range(0, N)]
    oo = ["" for _ in range(0, N)]
    ff = [False for _ in range(0, N)]
    while True:
        for i in range(0, N):
            # read() blocks until this child closes its stdout, so it
            # drains everything the child wrote.
            oo[i] += pp[i].stdout.read().decode()
            if pp[i].poll() is not None:
                ff[i] = True
        done = all(ff)
        if done:
            for o in oo:
                V += int(o)
            break
    print(V)

if __name__ == "__main__":
    __main__()

File t2.py does exactly what you want. File t1.py simulates your long running C program.

I edited my example because perh is right: there is no need for threads here, since subprocess creates a new process. He is also right that there can be a deadlock if your program has a huge output (larger than the pipe buffer), so we have to keep reading from the pipe while waiting for the process to finish.

Dmitrii Tokarev
  • I think you should be able to replace your loop over `ff` with `done = all(ff)`. This will be true if all the values in `ff` are true and false if otherwise. – Sam Mussmann Oct 13 '12 at 18:10
  • See... you say _process_, not processes; that's my problem. In the problem statement I want multiple running at once, e.g. a few large matrix calculations. – Eiyrioü von Kauyf Oct 14 '12 at 00:32
  • I say process meaning the interaction with one process, but you can see that in my example I am starting 10 processes, collecting output from them, adding it to the variable V, and printing it; as expected I get 100 as the output of t2.py. – Dmitrii Tokarev Oct 14 '12 at 05:29
  • @TruvorSkameikin Yes, though from the same example it seems that the output is being read synchronously. Is there no way to have asynchronous process input/output, as in MPI, but in Python? – Eiyrioü von Kauyf Oct 23 '12 at 06:33
0

You might want to look into using Python's ctypes module, which would let you compile your C program into a library and then call that library from your Python script.

Also, the SWIG project will let you call C or C++ code from Python without too much trouble (see this answer).

If you go this route, you probably should look into using a ThreadPool or some other mechanism to make the calls in parallel.
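
As a rough sketch of the ctypes route, assuming the C code has been built into a shared library libcompute.so exposing long compute(long a, long b) (both names are made up for illustration):

# Build the library first, e.g.: gcc -shared -fPIC -o libcompute.so compute.c
import ctypes
from multiprocessing.pool import ThreadPool

lib = ctypes.CDLL("./libcompute.so")
lib.compute.argtypes = [ctypes.c_long, ctypes.c_long]
lib.compute.restype = ctypes.c_long

def f(args):
    a, b = args
    return lib.compute(a, b)

# ctypes releases the GIL while the foreign call runs, so the worker
# threads really can execute the C code in parallel.
pool = ThreadPool()
results = pool.map(f, [(2, 3), (5, 6), (9, 10)])
print(results)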

Sam Mussmann