I am trying to run a Python script with a configuration file as input, using the following code:

try:
    process = check_output(["python", "E:/SpaceWeather/Source_codes/codes_written/tec-suite-master/tecs.py", \
                       "-c",str(cfgdirectory/'new.cfg')],stderr=STDOUT,timeout = 300)
except subprocess.CalledProcessError as e:
    print(e.output)

I wrote it this way because I was previously using Popen with process.communicate(), and it kept hanging. My script loops over different configuration files and calls check_output for each one; the timeout parameter is intended to end any run that takes too long. However, when I run the script now, it sometimes hangs on a particular iteration, seemingly at random, and I'm not sure why. The timeout parameter does not seem to end the process and move on to the next iteration of the loop.

Does anybody know why this might be happening?

NOTE: The script E:/SpaceWeather/Source_codes/codes_written/tec-suite-master/tecs.py, which takes the configuration file as input, is in a different location from my working directory. I'm not sure whether this makes a difference.

EDIT: I tried the following solution from "Subprocess timeout failure":

import os
import signal
from subprocess import Popen, PIPE, TimeoutExpired
from time import monotonic as timer

start = timer()
with Popen('sleep 30', shell=True, stdout=PIPE, preexec_fn=os.setsid) as process:
    try:
        output = process.communicate(timeout=1)[0]
    except TimeoutExpired:
        os.killpg(process.pid, signal.SIGINT) # send signal to the process group
        output = process.communicate()[0]

and it seems to run through the loop fine. However, at the end of the loop the process does not end and hangs indefinitely. How can I force the process to actually terminate at the end of the loop?

EDIT 2: I edited some lines to account for being on Windows, and everything seems to run fine UNTIL one of the processes times out. After that, everything hangs. For some reason, even processes that do what they are supposed to will not end!

with Popen(["python", "E:/SpaceWeather/Source_codes/codes_written/tec-suite-master/tecs.py", \
                   "-c",str(cfgdirectory/'new.cfg')],stdout=PIPE \
                        ,shell=False) as process:
    try:   
        output = process.communicate(timeout=300)[0]

    except TimeoutExpired:
        print('Forced kill at ' + str(observationDateTime))
#        os.kill(process.pid, signal.CTRL_C_EVENT)
        subprocess.call(['taskkill', '/F', '/T', '/PID', str(p.pid)])
        output = process.communicate()[0]
        print('done')

    except Exception as e:
        print('UNKNOWN EXCEPTION, TERMINATED')
#        os.kill(process.pid, signal.CTRL_C_EVENT)
        subprocess.call(['taskkill', '/F', '/T', '/PID', str(p.pid)])
        output = process.communicate()[0]
        print('done')

In the above code, when "Forced kill at ...." is printed, the "done" that should follow never appears. Could this be because I'm looping through processes, so that process keeps changing and is therefore difficult to terminate?

R Thompson
  • Did it throw a TimeoutExpired exception when the hang happened? The timeout parameter does not guarantee that the child process ends; you have to catch TimeoutExpired and call kill() yourself. But I did not find how to do that with a process spawned by check_output. – James Li Sep 02 '19 at 05:53
  • I have attempted a solution along the lines you recommended, but now the process will not terminate at the end of the loop, even when it exceeds the timeout. Is there a way to brute-force it to terminate? I have added more detail to my question. – R Thompson Sep 02 '19 at 06:10

1 Answer

You are on the right track; just change signal.SIGINT to signal.SIGKILL.

The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

while SIGINT can be caught, blocked, or ignored.
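
To illustrate the difference, here is a minimal POSIX-only sketch (the function name `sigint_vs_sigkill` is made up for this example and is not part of the question's code). It starts a child Python process that ignores SIGINT, shows that the child survives SIGINT, and then stops it with SIGKILL via `Popen.kill()`.

```python
import signal
import subprocess
import sys

# Child program: ignore SIGINT, announce readiness, then sleep.
CHILD_CODE = (
    "import signal, time\n"
    "signal.signal(signal.SIGINT, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def sigint_vs_sigkill():
    """Return (survived_sigint, exit_code) for a child that ignores SIGINT."""
    child = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                             stdout=subprocess.PIPE)
    try:
        # Wait until the child has installed its SIGINT handler.
        child.stdout.readline()

        child.send_signal(signal.SIGINT)
        try:
            child.wait(timeout=1)
            survived_sigint = False
        except subprocess.TimeoutExpired:
            # Still running: the child ignored SIGINT.
            survived_sigint = True

        # On POSIX, kill() sends SIGKILL, which the child cannot ignore.
        child.kill()
        exit_code = child.wait(timeout=5)
        return survived_sigint, exit_code
    finally:
        child.stdout.close()
```

On POSIX, a negative return code means the child was terminated by that signal number.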

The sample code below is modified from the one in the question and prints "done" within 2 seconds.
If you replace SIGKILL with SIGTERM, it can take as long as 10 seconds to print "done".

import os
import signal
from subprocess import Popen, PIPE, TimeoutExpired
from time import monotonic as timer

start = timer()
with Popen('bash -c "trap \\\"\\\" 9 15; sleep 10;"', shell=True, stdout=PIPE, preexec_fn=os.setsid) as process:
    try:
        output = process.communicate(timeout=1)[0]
    except TimeoutExpired:
        print("forced kill")
        os.killpg(process.pid, signal.SIGKILL) # send signal to the process group
        output = process.communicate()[0]
        print("done")

James Li
  • I have just given this a go and it is still hanging. I assume it is because a `TimeoutExpired` is not being raised. – R Thompson Sep 02 '19 at 06:49