
I'm writing a python script that launches programs in the background and then monitors to see if they encounter an error. I am using the subprocess module to start the process and keep a list of running programs.

processes.append((subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE), command))

I have found that when I try to monitor the programs by calling communicate on the subprocess object, the main program waits for the program to finish. I have tried to use poll(), but that doesn't give me access to the error code that caused the crash and I would like to address the issue and retry opening the process. runningProcesses is a list of tuples containing the subprocess object and the command associated with it.

def monitorPrograms(runningProcesses):
    for program in runningProcesses:
        temp = program[0].communicate()
        if program[0].returncode:
            if program[0].returncode == 1:
                print "Program exited successfully."
            else:
                print "Whoops, something went wrong. Program %s crashed." % program[0].pid

When I tried to get the return code without using communicate, the crash of the program didn't register. Do I have to use threads to run the communication in parallel or is there a simpler way that I am missing?

Juliuszc
    Your program will never print "Program exited successfully." because `returncode` cannot be 0 inside the outer `if` block. You could use a timeout in `communicate()` inside a try/except block and retry if it times out. – AChampion Sep 28 '15 at 17:48
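The timeout approach from the comment above can be sketched as follows (a minimal sketch; the `sleep` command stands in for your real program, and `communicate(timeout=...)` requires Python 3.3+):

```python
import subprocess
import sys

# Stand-in for a long-running program that has not crashed yet.
cmd = [sys.executable, "-c", "import time; time.sleep(10)"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

try:
    out, err = proc.communicate(timeout=0.5)
    print("finished with", proc.returncode)
except subprocess.TimeoutExpired:
    # Still running: check again later, or give up and kill it.
    print("still running, will check again later")
    proc.kill()
    proc.communicate()  # reap the child after killing it
```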

2 Answers


There is no need to use threads to monitor multiple processes, especially if you don't use their output (use `DEVNULL` instead of `PIPE` to hide it); see Python threading multiple bash subprocesses?

Your main issue is incorrect `Popen.poll()` usage. If it returns `None`, it means that the process is still running -- you should keep calling it until you get a non-`None` value. Here's a code example similar to your case that prints the statuses of several ping processes.
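A non-blocking monitoring loop might look like this (a minimal sketch; the two `sys.executable -c` commands are stand-ins for your real programs):

```python
import subprocess
import sys
import time

# Hypothetical commands for illustration; replace with your own.
commands = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
]

processes = [(subprocess.Popen(cmd), cmd) for cmd in commands]

# poll() returns None while a process is still running, and its
# exit status once it has finished -- so keep polling until done.
while any(proc.poll() is None for proc, _ in processes):
    time.sleep(0.1)

for proc, cmd in processes:
    if proc.returncode == 0:
        print("%s exited successfully" % cmd)
    else:
        print("%s crashed with code %d" % (cmd, proc.returncode))
```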

If you do want to capture a subprocess' stdout/stderr as a string, you could use threads or asyncio.

If you are on Unix and you control all the code that may spawn subprocesses, you could avoid polling and handle SIGCHLD yourself. The asyncio stdlib library can handle SIGCHLD for you. You could also implement it manually, though that may be complicated.
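With asyncio, for instance, you can await many subprocesses concurrently without polling, since the event loop reaps children via a SIGCHLD-based watcher on Unix (a sketch, requiring Python 3.7+; the commands are stand-ins):

```python
import asyncio
import sys

async def run_and_watch(cmd):
    # Waiting here suspends only this task, not the event loop;
    # the loop is notified when the child exits, so no polling.
    proc = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE)
    stdout, stderr = await proc.communicate()
    return proc.returncode, stdout, stderr

async def watch_all(commands):
    return await asyncio.gather(*(run_and_watch(c) for c in commands))

# Hypothetical commands standing in for your real programs.
commands = [
    [sys.executable, "-c", "print('ok')"],
    [sys.executable, "-c", "import sys; sys.exit(2)"],
]
results = asyncio.run(watch_all(commands))
for cmd, (code, out, err) in zip(commands, results):
    print(cmd, "exited with", code)
```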

jfs
  • Sorry if my initial question was unclear. I am trying to communicate with the open process so that I can receive the output after it crashes and hopefully fix the issue, i.e. resolve dependencies. So should I do something along the lines of the second example in Python threading multiple bash subprocesses?, but use `.communicate()` instead of `call`? I thought that if I use `Popen.poll()` I will lose the stdout and only get the return code, which wouldn't help me resolve the cause of the crash. – Juliuszc Sep 29 '15 at 14:43
  • @Juliuszc: it is safe to call `Popen.poll()` — it is a nonblocking way to ask a subprocess: "are you dead and if you are then give me your exit status". Though if you do need to get subprocess' stdout/stderr then you don't need to poll: `Popen.communicate()` among other things reaps the child process and sets `Popen.returncode` (indirectly). See [code examples I've linked above](http://stackoverflow.com/a/23616229/4279). – jfs Sep 29 '15 at 20:23

Based on my research, the best way to do this is with threads. Here's an article that I referenced when creating my own package to solve this problem.

The basic method used here is to spin off threads that continually read the log output (and finally the exit status) of the subprocess call.

Here's an example of my own "receiver" which listens for logs:

import threading

class Receiver(threading.Thread):
    """Reads lines from a subprocess stream on a background thread."""

    def __init__(self, stream, stream_type=None, callback=None):
        super(Receiver, self).__init__()
        self.stream = stream          # e.g. process.stdout or process.stderr
        self.stream_type = stream_type
        self.callback = callback      # optional per-line hook, e.g. a logger
        self.complete = False
        self.text = ''

    def run(self):
        # readline() returns '' only at EOF, i.e. when the stream closes.
        for line in iter(self.stream.readline, ''):
            line = line.rstrip()
            if self.callback:
                line = self.callback(line, msg_type=self.stream_type)
            self.text += line + "\n"
        self.complete = True

And now the code that spins the receiver off:

def _execute(self, command):
    # Requires the os, signal, and subprocess modules.
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               shell=True, preexec_fn=os.setsid)
    out = Receiver(process.stdout, stream_type='out', callback=self.handle_log)
    err = Receiver(process.stderr, stream_type='err', callback=self.handle_log)
    out.start()
    err.start()
    try:
        self.wait_for_complete(out)
    except CommandTimeout:
        # Kill the whole process group created by os.setsid above.
        os.killpg(process.pid, signal.SIGTERM)
        raise
    else:
        status = process.poll()
        output = CommandOutput(status=status, stdout=out.text, stderr=err.text)
        return output
    finally:
        out.join(timeout=1)
        err.join(timeout=1)

CommandOutput is simply a named tuple that makes it easy to reference the data I care about.

You'll notice I have a method `wait_for_complete` which waits for the receiver to set `complete = True`. Once complete, `_execute` calls `process.poll()` to get the exit code. We now have all stdout/stderr and the status code of the process.
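`wait_for_complete` isn't shown above; a minimal version might look like this (a sketch under my own assumptions: a `CommandTimeout` exception and a simple poll-the-flag loop, not necessarily how the original package implements it):

```python
import time

class CommandTimeout(Exception):
    """Raised when the subprocess does not finish in time."""

def wait_for_complete(receiver, timeout=60, interval=0.1):
    # Wait for the receiver's run() method to set complete = True,
    # which happens once the subprocess closes its stdout stream.
    deadline = time.time() + timeout
    while not receiver.complete:
        if time.time() > deadline:
            raise CommandTimeout("command did not finish in %ss" % timeout)
        time.sleep(interval)
```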

jesseops