
Using subprocess.Popen produces incomplete results, whereas subprocess.call gives correct output

This is related to a regression script that has 6 jobs; each job performs the same task, but on different input files, and I'm running all of them in parallel using subprocess.Popen.

Each task is performed by a shell script that calls a number of C-compiled executables, which generate text reports and then convert the report contents into jpg images.

A sample of the shell script (file name: runit) that calls the C-compiled executables:

#!/bin/csh -f
#file name : runit

#C - Executable 1
clean_spgs

#C - Executable 2
scrub_spgs_all file1
scrub_spgs_all file2

#C - Executable 3
scrub_pick file1 1000
scrub_pick file2 1000

While using subprocess.Popen, both scrub_spgs_all and scrub_pick appear to run in parallel, causing the script to generate incomplete results: the output text files don't contain complete information, and some of the output text reports are missing entirely.

The subprocess.Popen call is:

resrun_proc = subprocess.Popen("./"+runrescompare, shell=True, cwd=rescompare_dir, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

where runrescompare is a shell script containing:

#!/bin/csh

#some other text

./runit

Using subprocess.call instead generates all the output text files and jpg images correctly, but then I can't run all 6 jobs in parallel:

resrun_proc = subprocess.call("./"+runrescompare, shell=True, cwd=rescompare_dir, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

What is the correct way to call a C executable from a shell script using Python subprocess calls so that all 6 jobs can run in parallel (using Python 3.5.1)?

Thanks.

Vamsi

1 Answer


You tried to simulate multiprocessing with subprocess.Popen(), which does not work the way you want: with stdout=subprocess.PIPE, a process blocks as soon as the pipe buffer fills up, unless you consume the output, for instance with communicate() (but that blocks the caller) or by reading the pipe yourself; and with 6 concurrent handles to read in a loop, you are bound to get deadlocks.
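
If you want to keep Popen, one way to sidestep the pipe problem entirely is to redirect each job's output to a log file instead of a pipe: then there is no pipe buffer to fill, and wait() cannot deadlock. A minimal sketch, where the directories job1 … job6 are hypothetical stand-ins for your six jobs:

import subprocess

procs = []
for jobdir in ["job1", "job2", "job3", "job4", "job5", "job6"]:
    # send the job's output to a file: the OS writes it directly,
    # so there is no pipe buffer to fill and nothing to drain
    logfile = open(jobdir + "/run.log", "w")
    p = subprocess.Popen("./runit", shell=True, cwd=jobdir,
                         stdout=logfile, stderr=subprocess.STDOUT)
    procs.append((p, logfile))

for p, logfile in procs:
    p.wait()          # safe: no pipes are involved
    logfile.close()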

That said, the best way is to run the subprocess.call lines in separate threads.

There are several ways to do it. Here is a small, simple example with locking:

import threading
import time

lock = threading.Lock()

def func1(a, b, c):
    # serialize printing so output from different threads doesn't interleave
    with lock:
        print(a, b, c)
    time.sleep(10)

tl = []
t = threading.Thread(target=func1, args=[1, 2, 3])
t.start()
tl.append(t)
t = threading.Thread(target=func1, args=[4, 5, 6])
t.start()
tl.append(t)

# wait for all threads to complete (if you want to wait, else
# you can skip this loop)
for t in tl:
    t.join()
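
Applied to your script, the same pattern means each thread runs one job with subprocess.call(): the call blocks that thread until the whole runrescompare script (and therefore every executable inside runit) has finished, while the other jobs keep running in their own threads. A sketch, again with hypothetical job directories, assuming each one contains your runrescompare script:

import subprocess
import threading

def run_job(script, jobdir):
    # call() blocks this thread until the job's script has finished,
    # so the C executables inside runit still run one after another
    subprocess.call("./" + script, shell=True, cwd=jobdir)

tl = []
for jobdir in ["job1", "job2", "job3", "job4", "job5", "job6"]:
    t = threading.Thread(target=run_job, args=["runrescompare", jobdir])
    t.start()
    tl.append(t)

for t in tl:
    t.join()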

I took the time to create an example more suitable to your needs: two threads each execute a command and capture its output with check_output(), then print it within the lock to avoid the output getting mixed up. I'm on Windows, and I list the C and D drives in parallel.

import subprocess
import threading

lock = threading.Lock()

def func1(runrescompare, rescompare_dir):
    # check_output runs the command and returns its standard output as a string
    resrun_proc = subprocess.check_output(runrescompare, shell=True,
                                          cwd=rescompare_dir, stderr=subprocess.PIPE,
                                          universal_newlines=True)
    with lock:
        print(resrun_proc)

tl = []
t = threading.Thread(target=func1, args=["ls", "C:/"])
t.start()
tl.append(t)
t = threading.Thread(target=func1, args=["ls", "D:/"])
t.start()
tl.append(t)

# wait for all threads to complete (if you want to wait, else
# you can skip this loop)
for t in tl:
    t.join()
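
A variant worth knowing about: since Python 3.2 the standard library also ships concurrent.futures, whose ThreadPoolExecutor wraps the same thread-per-job idea in a pool and hands each command's output back to the main thread. A short sketch using the same toy commands:

import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_job(cmd, cwd):
    # check_output raises CalledProcessError on a non-zero exit status,
    # so a failing job surfaces as an exception instead of silent bad output
    return subprocess.check_output(cmd, shell=True, cwd=cwd,
                                   universal_newlines=True)

jobs = [("ls", "C:/"), ("ls", "D:/")]
with ThreadPoolExecutor(max_workers=6) as pool:
    futures = [pool.submit(run_job, cmd, cwd) for cmd, cwd in jobs]
    for f in futures:
        print(f.result())  # only the main thread prints, so no lock is needed

f.result() also re-raises any exception from the worker thread, which a plain threading.Thread does not do for you.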
Jean-François Fabre
  • It worked. I replaced my subprocess.Popen with threading.Thread and everything ran smoothly but the only difference is my program waits until all the threads are finished completely. – Vamsi Sep 02 '16 at 13:04
  • Oh, in that case, you can skip the last loop on `join`, I'm editing the answer. Of course please accept it if it works. – Jean-François Fabre Sep 02 '16 at 13:31
  • @Vamsi: you could [use a thread pool or async. io to get output from multiple external commands in parallel](http://stackoverflow.com/a/23616229/4279) – jfs Oct 29 '16 at 07:51