
The code below runs several commands in parallel, each with its own timeout. If a command has not finished by its timeout, it should be stopped (I am using terminate()).

The problem is that after termination (returncode is set to a negative value), communicate() hangs. When I force an exit with Ctrl+C, the following traceback is displayed:

(stdout, stderr) = proc.communicate()
File "python3.7/subprocess.py", line 926, in communicate
stdout = self.stdout.read()

Code

procList = []
for app in appList:
    try:
        p = subprocess.Popen(app['command'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
        procList.append((app, p))
    except Exception as e:
        print(e)

start = time.time()
while len(procList):
    time.sleep(30)
    try:
        for app, proc in procList:
            if (time.time() - start > app['timeoutSec']):
                proc.terminate()
            if proc.poll() is not None and app['mailSent'] == 0:
                (stdout, stderr) = proc.communicate()  # Hangs here if the process was terminated
                send_results_mail('Execution Completed or Terminated')
                app['mailSent'] = 1
    except subprocess.SubprocessError as e:
        print(e)
    procList = [(app, proc) for (app, proc) in procList if app['mailSent'] == 0]
Roopesh90

2 Answers


EDIT: Here is a working example that uses kill() with a misbehaving child (one that does not respond to terminate()). I'm not sure of the nature of your child process, but I hope this helps you get closer to a solution.

Simple child program, Python 3:

import signal
import time

def sighandler(signal, _stack):
    print(f"Ignoring signal {signal}")

signal.signal(signal.SIGTERM, sighandler)
signal.signal(signal.SIGINT, sighandler)

secs = 10
print(f"Sleeping {secs} seconds...")
time.sleep(secs)
print("Exiting...")

Revised parent program, Python 3:

import subprocess
import time

start = time.time()
appList = [
   {'command': 'python ./child.py', 'timeoutSec': 2, 'mailSent': 0},
   {'command': 'python ./child.py', 'timeoutSec': 2, 'mailSent': 0},
   {'command': 'python ./child.py', 'timeoutSec': 2, 'mailSent': 0},
]

def logmsg(msg):
    elap = time.time()-start
    print(f"{elap:2.1f} secs: {msg}")

def send_results_mail(result):
    logmsg(f"Result: {result}")

procList = []
for app in appList:
    try:
        p = subprocess.Popen(app['command'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
        procList.append((app, p))
        # Log inside the try block so p is never referenced if Popen failed
        logmsg(f"Launching child... {p.pid}")
    except Exception as e:
        logmsg(e)

start = time.time()
while len(procList):
    time.sleep(1)
    try:
        for app, proc in procList:
            if (time.time() - start > app['timeoutSec']):
                logmsg(f"Trying to terminate()...{proc.pid}")
                proc.terminate()
                proc.kill()
            if proc.poll() is not None and app['mailSent'] == 0:
                proc.kill()
                logmsg(f"Trying to communicate()...{proc.pid}")
                (stdout, stderr) = proc.communicate()
                send_results_mail('Execution Completed or Terminated')
                app['mailSent'] = 1
    except subprocess.SubprocessError as e:
        logmsg(e)
    procList = [(app, proc) for (app, proc) in procList if app['mailSent'] == 0]

Output:

$ python parent.py
0.0 secs: Launching child... 537567
0.0 secs: Launching child... 537568
0.0 secs: Launching child... 537569
2.0 secs: Trying to terminate()...537567
2.0 secs: Trying to terminate()...537568
2.0 secs: Trying to terminate()...537569
3.0 secs: Trying to terminate()...537567
3.0 secs: Trying to communicate()...537567
3.0 secs: Result: Execution Completed or Terminated
3.0 secs: Trying to terminate()...537568
3.0 secs: Trying to communicate()...537568
3.0 secs: Result: Execution Completed or Terminated
3.0 secs: Trying to terminate()...537569
3.0 secs: Trying to communicate()...537569
3.0 secs: Result: Execution Completed or Terminated
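
The escalation in the parent program above (terminate() followed by kill()) can also be written with a grace period, so that well-behaved children get a chance to exit cleanly. The following is only a sketch under assumptions: stop_process and grace_sec are hypothetical names that do not appear in the original code, the inline child is a stand-in for child.py, and the negative return code assumes a POSIX system.

```python
import subprocess
import sys

def stop_process(proc, grace_sec=2.0):
    """Send SIGTERM, wait up to grace_sec, then fall back to SIGKILL."""
    proc.terminate()
    try:
        proc.wait(timeout=grace_sec)
    except subprocess.TimeoutExpired:
        proc.kill()   # the child ignored SIGTERM
        proc.wait()   # reap it so returncode is set
    return proc.returncode

# Stand-in child that ignores SIGTERM, similar in spirit to child.py above.
CHILD = (
    "import signal, time; "
    "signal.signal(signal.SIGTERM, signal.SIG_IGN); "
    "print('ready', flush=True); "
    "time.sleep(30)"
)

proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
proc.stdout.readline()            # wait until the child has installed its handler
returncode = stop_process(proc, grace_sec=0.5)
output, _ = proc.communicate()    # returns at once: no process still holds the pipe
print(returncode)
```

This only helps when the child has no descendants of its own; if it has spawned further processes, they keep the pipe open and communicate() can still block.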
Chuck T.
  • Thanks for the response. The example you shared applies a timeout to communicate(), whereas I need a timeout for the process itself, so that part cannot be removed. Also, adding the timeout results in the while loop ending and proc.kill() being executed, yet the process is still not terminated. – Anubhav Bansal Feb 25 '20 at 07:27
  • Thanks for your feedback, @AnubhavBansal. I adjusted the response to show a working example that uses kill(). I hope this helps you get closer to a solution. – Chuck T. Feb 25 '20 at 22:33

Thanks for all the suggestions and responses, but none of them solved the problem, because the processes I was spawning created multiple levels of subprocesses.

So I had to terminate the processes recursively, and for that I used a psutil-based solution.

Please see the post below for details:

https://stackoverflow.com/a/27034438/2393961

Once all the child and grandchild processes are killed, communicate() works fine.
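
The reason is that communicate() reads until end-of-file, and the pipe only reaches end-of-file once every process holding its write end has exited, grandchildren included. As an alternative to the psutil approach in the linked post, here is a standard-library sketch that puts the command in its own process group and kills the whole group on timeout. It assumes a POSIX system, and the shell command is a made-up stand-in for a command that spawns its own subprocesses.

```python
import os
import signal
import subprocess

# shell=True starts /bin/sh, which starts a background sleep (a grandchild).
# The sleep inherits the stdout pipe and would keep it open after sh dies.
proc = subprocess.Popen(
    "sleep 30 & echo started; wait",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    start_new_session=True,  # sh becomes leader of a new process group
)

try:
    stdout, _ = proc.communicate(timeout=1)
except subprocess.TimeoutExpired:
    # With start_new_session=True the group id equals the child's pid,
    # so this signals sh and every descendant still in that group.
    os.killpg(proc.pid, signal.SIGKILL)
    stdout, _ = proc.communicate()  # no writers left, so this returns

print(proc.returncode, stdout)
```

A descendant that moves itself into a different session or process group would escape this, which is one case where walking the process tree, as the psutil solution does, may be the safer choice.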

Another thing I learned: even though htop can show processes as a tree, each process is independent, and killing a parent does not automatically kill its descendants. Thanks to my friend Noga for pointing that out.
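
That independence is easy to observe. The sketch below, which assumes a POSIX system and uses a made-up shell command plus a hypothetical helper named is_alive, kills only the shell and then checks whether the background sleep it started is still running.

```python
import os
import signal
import subprocess

def is_alive(pid):
    """Return True if a process with this pid exists (signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    return True

# The shell prints the pid of a background sleep (the grandchild), then waits.
proc = subprocess.Popen(
    "sleep 30 & echo $!; wait",
    shell=True,
    stdout=subprocess.PIPE,
)
grandchild_pid = int(proc.stdout.readline())

proc.kill()   # kills only the shell, not the sleep it started
proc.wait()

survived = is_alive(grandchild_pid)
print(f"grandchild still running after parent was killed: {survived}")

os.kill(grandchild_pid, signal.SIGKILL)  # clean up the orphaned sleep
proc.stdout.close()
```

Because the surviving sleep still holds the write end of the stdout pipe, a communicate() call at the point where survived is computed would block until the sleep exits on its own.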