
I am running some shell scripts with the subprocess module in Python. If a shell script runs too long, I would like to kill the subprocess. I thought it would be enough to pass timeout=30 to my run(..) call.

Here is the code:

try:
    result=run(['utilities/shell_scripts/{0} {1} {2}'.format(
                        self.language_conf[key][1], self.proc_dir, config.main_file)],
                shell=True,
                check=True,
                stdout=PIPE,
                stderr=PIPE, 
                universal_newlines=True, 
                timeout=30,
                bufsize=100)
except TimeoutExpired as timeout:

I have tested this call with a shell script that runs for 120 s. I expected the subprocess to be killed after 30 s, but in fact the process finishes the 120 s script and only then raises the TimeoutExpired exception. So the question is: how can I kill the subprocess on timeout?

Max
  • have you tried legacy methods with `Popen` ? – Jean-François Fabre Feb 13 '18 at 09:37
  • What do you have in your `except` block? From the doc: "*The child process is not killed if the timeout expires, so in order to cleanup properly a well-behaved application should kill the child process and finish communication*" – cdarke Feb 13 '18 at 09:38
  • I have read the official document, it sends `SIGKILL` to kill the subprocess. Maybe your script cannot be killed by `SIGKILL`? Try it in raw terminal. – Sraw Feb 13 '18 at 09:38
  • @cdarke That is the behavior of `Popen` but not `run`. `run` will kill the child process. – Sraw Feb 13 '18 at 09:39
  • @Sraw: sorry, you are right. I would still like to know what is in the `except` block though. – cdarke Feb 13 '18 at 09:40
  • @Sraw `SIGKILL` never fails, at least in *nix. – llllllllll Feb 13 '18 at 09:44
  • @liliscent I'm not sure about it as wikipedia tells me there are some exceptions: https://en.wikipedia.org/wiki/Signal_(IPC)#POSIX_signals. See SIGKILL part. – Sraw Feb 13 '18 at 09:46
  • @cdarke: from the [docs for `subprocess.run`](https://docs.python.org/3/library/subprocess.html): "The timeout argument is passed to Popen.communicate(). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated." – Jean-François Fabre Feb 13 '18 at 10:02
  • @cdarke The except block contains only a logging statement, that's it. – Max Feb 13 '18 at 11:38
  • Some more background information: I am starting a shell script to execute programs that are sent in by students. This is part of an automated test of the uploads. Sometimes the students' programs contain infinite loops (`while True: ...`). These programs block the testing chain, so I would like to kill them after a timeout. The programs can be written in Python, C, matlab etc. That's why I am using shell scripts to start (and compile) them. – Max Feb 13 '18 at 11:44
  • Unless you are using shell built-in commands or shell meta-characters then you don't need a shell. Shell scripts can be run as any other program. – cdarke Feb 13 '18 at 14:13
  • I need shell comments – Max Feb 13 '18 at 16:07

1 Answer


The documentation for subprocess.run explicitly states that the process should be killed:

"The timeout argument is passed to Popen.communicate(). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated."
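To make that quoted behaviour concrete, here is a minimal sketch of roughly what `run(..., timeout=...)` does on timeout. It is a simplification for illustration, not the actual standard-library source, and `run_with_timeout` is a hypothetical helper name. Note that `kill()` is only sent to the direct child. According to the bug report linked in the comments (issue37424), Python versions before 3.7.5 could additionally keep blocking after the kill when another process still held the output pipes open.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Simplified sketch of run()'s timeout handling (not the real source)."""
    with subprocess.Popen(cmd,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE,
                          universal_newlines=True) as proc:
        try:
            out, err = proc.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()  # signals only the direct child, not its descendants
            proc.wait()  # reap the child so it does not linger as a zombie
            raise        # re-raise TimeoutExpired to the caller
    return proc.returncode, out, err


if __name__ == "__main__":
    # A child that would sleep for 30 s is cut off after about 1 s.
    try:
        run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1)
    except subprocess.TimeoutExpired as exc:
        print("timed out after", exc.timeout, "seconds")
```

With `shell=True` the direct child is the shell, so whatever the shell started is not covered by that `kill()`.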

But in your case you're using shell=True, and I've seen issues like that before: the blocking process is a child of the shell process, so killing the shell does not necessarily kill it.

I don't think you need shell=True if you decompose your arguments properly and your scripts have the proper shebang. You could try this:

import os
from subprocess import run, PIPE

result = run(
    [os.path.join('utilities/shell_scripts', self.language_conf[key][1]),
     self.proc_dir, config.main_file],  # one list element per argument; don't compose the command line yourself
    shell=False,  # no shell wrapper
    check=True,
    stdout=PIPE,
    stderr=PIPE,
    universal_newlines=True,
    timeout=30,
    bufsize=100)
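If a script has no shebang line or is not marked executable, one alternative (an assumption about your setup, not something taken from your code) is to name the interpreter as the first list element. The sketch below is POSIX-only and uses a throwaway script, `demo.sh`, as a hypothetical stand-in for one of your shell scripts. It also shows that each argument stays intact, spaces included. Without a shell, a list holding one long string such as 'script arg1 arg2' is looked up as a single program name, which is one way to end up with a FileNotFoundError.

```python
import os
import subprocess
import tempfile

# Hypothetical stand-in for one of the scripts in utilities/shell_scripts.
script_body = 'echo "dir=$1 file=$2"\n'

with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "demo.sh")
    with open(script_path, "w") as fh:
        fh.write(script_body)

    # "sh" is named explicitly, so the script needs neither a shebang
    # nor the executable bit. Each argument is its own list element.
    result = subprocess.run(
        ["sh", script_path, "some dir", "main.py"],
        check=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=30,
    )

print(result.stdout)
```

Here `run` starts `sh` directly, so the process that gets killed on timeout is the one interpreting the script rather than a wrapper shell around it. Anything the script itself launches is still a separate child process.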

Note that I can reproduce this issue very easily on Windows (using Popen, but it's the same thing):

import subprocess,time

p=subprocess.Popen("notepad",shell=True)
time.sleep(1)
p.kill()

=> notepad stays open, probably because it manages to detach from the parent shell process.

import subprocess,time

p=subprocess.Popen("notepad",shell=False)
time.sleep(1)
p.kill()

=> notepad closes after 1 second

Funnily enough, if you remove time.sleep(), kill() works even with shell=True, probably because it successfully kills the shell before it has finished launching notepad.

I'm not saying you have exactly the same issue, I'm just demonstrating that shell=True is evil for many reasons, and not being able to kill/timeout the process is one more reason.

However, if you need shell=True for a reason, you can use psutil to kill all the children in the end. In that case, it's better to use Popen so you get the process id directly:

import subprocess, time, psutil

proc = subprocess.Popen("notepad", shell=True)
for _ in range(30):  # poll once per second, for up to 30 seconds
    if proc.poll() is not None:  # process just ended
        break
    time.sleep(1)
else:
    # the for loop ended without break: timeout
    parent = psutil.Process(proc.pid)
    for child in parent.children(recursive=True):  # or parent.children() for recursive=False
        child.kill()
    parent.kill()

(source: how to kill process and child processes from python?)

That example kills the notepad instance as well.
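On Linux and other POSIX systems, a standard-library alternative to psutil is to start the shell in its own session and kill the whole process group on timeout. This is a different technique from the psutil approach above, offered as a sketch only: `run_shell_with_timeout` is a hypothetical helper name, the code does not work on Windows, and it assumes the started programs do not move themselves into another session or process group.

```python
import os
import signal
import subprocess


def run_shell_with_timeout(command, timeout):
    """Run command through the shell; kill its whole process group on timeout."""
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        start_new_session=True,  # the shell becomes leader of a new process group
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The group id equals proc.pid because the child called setsid().
        os.killpg(proc.pid, signal.SIGKILL)
        proc.communicate()  # reap the shell and drain the pipes
        raise
    return proc.returncode, out, err


if __name__ == "__main__":
    try:
        run_shell_with_timeout("sleep 120; echo finished", timeout=2)
    except subprocess.TimeoutExpired:
        print("shell and its children were killed after 2 seconds")
```

Because the signal goes to the group rather than to a single PID, the `sleep` started by the shell is killed together with the shell itself.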

Jean-François Fabre
  • In `run`'s source code, it also uses `kill()` to terminate the child process. So in this case, I don't think this will work. But who knows... – Sraw Feb 13 '18 at 09:48
  • @Sraw: I think it's the bloody shell=True which causes the issue. – Jean-François Fabre Feb 13 '18 at 10:08
  • I'm... still not sure about it as I have never seen this behavior. Since you said you have seen it, could you do us a favor by providing an example which will cause this situation? – Sraw Feb 13 '18 at 10:12
  • @Sraw actually I _can_. I'm using windows, but I'm sure it can be reproduced on Linux as well. – Jean-François Fabre Feb 13 '18 at 10:32
  • For various reasons, I need to execute the commands in the shell, so `shell=True` is necessary. When I leave that argument out I get a `FileNotFoundError`. Or is this error caused by some other misconfiguration? – Max Feb 13 '18 at 11:25
  • @Max: see my edit, it's possible with `Popen` and psutil package which kills all the children manually. – Jean-François Fabre Feb 13 '18 at 12:52
  • I have tested it with `subprocess.run`, and it doesn't work as expected. Sure enough, after the child process is killed, `notepad` is still open. But `subprocess.run` also returns and raises a `TimeoutExpired` exception. So part of what you said is true: it won't fully kill `notepad`. But part is wrong: it won't block `subprocess.run`, because `subprocess.run` is blocked by the `shell` process, not by all of its children. BTW, I tested it on Windows. – Sraw Feb 13 '18 at 13:15
  • the problem with `subprocess.run` is that you cannot get the process id / when exception is called the master process is already killed. That's why I'm recommending `Popen` instead – Jean-François Fabre Feb 13 '18 at 13:32
  • The psutil suggestion isn't reliable on Windows. A process only has the PID of its parent, which may have already exited. The way to do this reliably in Windows 8+ is with a Job object that disallows silent breakaway (or disallows breakaway entirely, but that may cause failures if a subprocess insists on breaking a child out of the job). Create the process suspended (avoid race conditions), assign it to the job, and then resume it. Windows 7 doesn't support nested jobs, so if Python is in a job already, you can try to break the child out, but it may be disallowed. – Eryk Sun Mar 26 '18 at 17:09
  • I encountered the same issue as OP but with `shell=False`, using Python 3.6.9 on Ubuntu 18 (but NOT with Python 3.8 on Windows). Turns out this is/was a bug https://bugs.python.org/issue37424 fixed in Python 3.7+: the process should now in fact be killed upon timeout and not continue. Not sure yet how to work around the bug on Linux machines where I can't upgrade Python. – AdamE Dec 23 '20 at 21:00
  • Update: The bug issue37424 above is "fixed" in Python 3.7.5 to be precise. Also, I upgraded via `sudo apt install python3.8` on Ubuntu 18 and now timeout is "honored" (thread is not blocked). Yet the actual process I called is not killed and continues (which may be a problem with my process not honoring termination call, dunno). – AdamE Dec 23 '20 at 21:23