
I'm using subprocess to run a command line program from a Python (3.5.2) script, which I am running in a Jupyter notebook. The subprocess takes a long time to run and so I would like its stdout to be printed live to the screen in the Jupyter notebook.

I can do this no problem in a normal Python script run from the terminal. I do this using:

def run_command(cmd):
    from subprocess import Popen, PIPE
    import shlex

    with Popen(shlex.split(cmd), stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        for line in p.stdout:
            print(line, end='')
        exit_code = p.poll()
    return exit_code

However, when I run the script in a Jupyter notebook, it does not print the stdout live to the screen. Instead, it prints everything after the subprocess has finished running.

Does anyone have any ideas on how to remedy this?

Many thanks, Johnny

Johnny Hunter
    Try adding flush=True to your print – Padraic Cunningham Jul 27 '16 at 14:55
  • Thanks, Padraic. I tried that, but it didn't work. I also tried adding sys.stdout.flush() before the for loop, and that didn't work either. – Johnny Hunter Jul 27 '16 at 15:50
  • In what system and version of Jupyter are you running this? I ran your code with jupyter_client 4.3.0, jupyter_console 5.0.0 and jupyter_core 4.1.0 under Ubuntu and Python 3, and stdout was progressively printed as the process was generating it. – foglerit Jul 27 '16 at 16:43
  • What command are you running? – Padraic Cunningham Jul 27 '16 at 16:44
  • @jonnat, thanks! My versions are: jupyter==1.0.0 jupyter-client==4.3.0 jupyter-console==5.0.0 jupyter-core==4.1.0 – Johnny Hunter Jul 27 '16 at 17:09
  • @Padraic, I'm running another python script that does image fingerprinting. When I run the subprocess from within the parent script in PyCharm, it prints the stdout live, as I want it to. But this doesn't happen when I run the script in a Jupyter notebook. – Johnny Hunter Jul 27 '16 at 17:11
  • Thanks @JohnnyHunter. This is what I tried, can you check if you see the issue in this case? I created a file in the same folder as the notebook called slow.sh containing the single line "du; sleep 2; du; sleep 2; du". Then I ran your function as "run_command('bash slow.sh')". In my machine I see the results printed progressively. – foglerit Jul 27 '16 at 17:18
  • Hey guys. Sorry about the delayed response - I was away on holiday. @jonnat, I tried running your slow.sh script from the notebook, and it does indeed print the output live. I've figured out that the problem only occurs when running a command-line Python script from within another Python script. I've therefore just converted the command line script to a module and called that instead. Thanks for your help, Padraic and Jonnat! – Johnny Hunter Aug 10 '16 at 16:05

6 Answers


The IPython notebook has its own support for running shell commands. If you don't need to capture the output with subprocess, you can just do

cmd = 'ls -l'
!{cmd}

Output from commands executed with ! is automatically piped through the notebook.
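If you do want the command's output back in Python, the notebook can also capture it into a variable. This is IPython-only cell syntax, not plain Python, so it only works inside a notebook or IPython session:

```
# In a notebook cell: `files` becomes a list-like SList of output lines
files = !ls -l
print(files[0])
```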

cdleonard

If you set stdout = None (this is the default, so you can omit the stdout argument altogether), then your process should write its output to the terminal running your IPython notebook server.

This happens because, by default, subprocess children inherit the parent's file handles (see the docs).

Your code would look like this:

from subprocess import Popen, PIPE
import shlex

def run_command(cmd):
    p = Popen(shlex.split(cmd), bufsize=1, universal_newlines=True)
    return p.wait()  # wait() blocks until exit; poll() would return None while the process is still running

This won't print to the notebook in the browser, but at least you will be able to see the output from your subprocess asynchronously while other code is running.
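To make the difference concrete, here is a minimal sketch contrasting inherited versus captured output (assuming a POSIX `echo` is available):

```python
import subprocess

# stdout=None (the default): the child inherits this process's stdout,
# so 'inherited' goes to the server terminal, not into the result object.
subprocess.run(["echo", "inherited"])

# stdout=PIPE: the output is captured instead of written to the terminal.
result = subprocess.run(["echo", "captured"], stdout=subprocess.PIPE,
                        universal_newlines=True)
print(result.stdout, end="")
```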

Hope this helps.

FluxLemur

Jupyter mucks with stdout and stderr. This should get what you want, and give you a more useful exception when the command fails to boot.

import signal
import subprocess as sp
import sys


class VerboseCalledProcessError(sp.CalledProcessError):
    def __str__(self):
        if self.returncode and self.returncode < 0:
            try:
                msg = "Command '%s' died with %r." % (
                    self.cmd, signal.Signals(-self.returncode))
            except ValueError:
                msg = "Command '%s' died with unknown signal %d." % (
                    self.cmd, -self.returncode)
        else:
            msg = "Command '%s' returned non-zero exit status %d." % (
                self.cmd, self.returncode)

        return f'{msg}\n' \
               f'Stdout:\n' \
               f'{self.output}\n' \
               f'Stderr:\n' \
               f'{self.stderr}'


def bash(cmd, print_stdout=True, print_stderr=True):
    proc = sp.Popen(cmd, stderr=sp.PIPE, stdout=sp.PIPE, shell=True, universal_newlines=True,
                    executable='/bin/bash')

    all_stdout = []
    all_stderr = []
    while proc.poll() is None:
        for stdout_line in proc.stdout:
            if stdout_line != '':
                if print_stdout:
                    print(stdout_line, end='')
                all_stdout.append(stdout_line)
        for stderr_line in proc.stderr:
            if stderr_line != '':
                if print_stderr:
                    print(stderr_line, end='', file=sys.stderr)
                all_stderr.append(stderr_line)

    stdout_text = ''.join(all_stdout)
    stderr_text = ''.join(all_stderr)
    if proc.wait() != 0:
        raise VerboseCalledProcessError(proc.returncode, cmd, stdout_text, stderr_text)
egafni

Replacing the for loop with the explicit readline() call worked for me.

from subprocess import Popen, PIPE
import shlex

def run_command(cmd):
    with Popen(shlex.split(cmd), stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        while True:
            line = p.stdout.readline()
            if not line:
                break
            print(line, end='')  # line already ends with '\n'
        exit_code = p.poll()
    return exit_code

Something is still broken about their iterators, even 4 years later.
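A related gotcha, consistent with the asker's finding that the problem only appeared when the child was itself a Python script: a Python child block-buffers its own stdout when it is attached to a pipe, so nothing arrives until the buffer fills or the process exits. Launching the child with `-u` (or setting `PYTHONUNBUFFERED=1` in its environment) forces line-by-line output. A minimal self-contained sketch:

```python
import sys
from subprocess import PIPE, Popen

# -u makes the child Python interpreter leave stdout unbuffered,
# so each line arrives as soon as it is printed.
child = [sys.executable, "-u", "-c", "print('tick'); print('tock')"]

lines = []
with Popen(child, stdout=PIPE, universal_newlines=True) as p:
    for line in p.stdout:
        lines.append(line.rstrip())
        print(line, end='')
exit_code = p.wait()
```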

Antony Hatchkins

Use the subprocess.check_output function:

>>> subprocess.check_output(['echo', 'foobar'])
b'foobar\n'

For Python 3 you get back a bytes object which you can decode:

>>> b = subprocess.check_output(['echo', 'foobar'])
>>> b.decode().strip()
'foobar'
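You can also skip the manual decode by asking for text output directly, via `universal_newlines=True` (or its alias `text=True` on Python 3.7+):

```python
import subprocess

# universal_newlines=True makes check_output return str instead of bytes
out = subprocess.check_output(['echo', 'foobar'], universal_newlines=True)
print(out.strip())
```

Note that check_output buffers everything until the command finishes, so it doesn't address the live-output part of the question.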

ntg

If you want to treat stdout and stderr separately, you can spawn two threads that handle them concurrently (live as the output is produced). This works in Jupyter notebooks as well as plain python interpreters / scripts.

Adapted from my more detailed answer:

import logging
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from subprocess import PIPE, CalledProcessError, CompletedProcess, Popen


def stream_command(
    args,
    *,
    stdout_handler=logging.info,
    stderr_handler=logging.error,
    check=True,
    text=True,
    stdout=PIPE,
    stderr=PIPE,
    **kwargs,
):
    """Mimic subprocess.run, while processing the command output in real time."""
    with Popen(args, text=text, stdout=stdout, stderr=stderr, **kwargs) as process:
        with ThreadPoolExecutor(2) as pool:  # two threads to handle the streams
            exhaust = partial(pool.submit, partial(deque, maxlen=0))
            exhaust(stdout_handler(line[:-1]) for line in process.stdout)
            exhaust(stderr_handler(line[:-1]) for line in process.stderr)
    retcode = process.poll()
    if check and retcode:
        raise CalledProcessError(retcode, process.args)
    return CompletedProcess(process.args, retcode)

Call with simple print handlers:

stream_command(["echo", "test"], stdout_handler=print, stderr_handler=print)
# test

Or with custom handlers:

outs, errs = [], []
def stdout_handler(line):
    outs.append(line)
    print(line)
def stderr_handler(line):
    errs.append(line)
    print(line)

stream_command(
    ["echo", "test"],
    stdout_handler=stdout_handler,
    stderr_handler=stderr_handler,
)
# test
print(outs)
# ['test']
ddelange