Take this code as an example (I know tar can compress by itself with -z, -J, or -j, and that Python has a dedicated tarfile module, but the command here just stands in for a long-running process):

    from subprocess import Popen, PIPE
    with open('tarball.tar.gz', 'w+') as tarball:
        tarcmd = Popen(['tar', '-cvf', '-', '/home'], stdout=PIPE)
        zipcmd = Popen(['gzip', '-c'], stdin=tarcmd.stdout, stdout=tarball)
        tarcmd.stdout.close()
        zipcmd.communicate()
        # Added a while loop that breaks when tarcmd gets a
        # proper return value. Can it be considered a good
        # solution?
        while tarcmd.poll() is None:
            print('waiting...')

        # test the values and do stuff accordingly

This is the typical example of piping two commands with Python's subprocess module. Checking the return code of zipcmd is easy, but how do I check whether tarcmd fails? If I check its returncode I always get None (I think because its stdout is closed). Basically, I want to raise an exception if either of the two commands fails. In bash there is $PIPESTATUS; how can I do the same in Python?

  • From the documentation of the subprocess module, about checking a process's returncode: "A None value indicates that the process hasn't terminated yet." Your tarcmd is still running, or you didn't check its result. Wait for it to end using .wait(), or check it with .poll(); those methods set the returncode value. – Maciek Jun 07 '16 at 12:19
  • Thanks, I added a while loop that breaks when tarcmd gets a proper returncode, and it seems to work now! (Can it be considered a good solution?) I also tried tarcmd.communicate(), but it doesn't work. If I put it after tarcmd.stdout.close(), then whether the command fails or succeeds, I always get "ValueError: I/O operation on closed file". If I put it before tarcmd.stdout.close(), then since communicate() waits until the command is over, stdout is not closed in time and tarcmd cannot receive a SIGPIPE if gzip ends prematurely. Any further help is appreciated. – Egidio Docile Jun 07 '16 at 14:00

1 Answer

> if I check its returncode I always get None

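If the value is None, the corresponding child process is still alive, or nothing has collected its exit status yet (returncode is only set by poll(), wait(), or communicate()). By the way, there is no need to call tarcmd.poll() in a loop: you can block until the process exits using tarcmd.wait().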
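A minimal sketch of that behaviour follows. The child here is a hypothetical stand-in (a short Python one-liner that exits with status 3), not the tar command from the question; the point is only that returncode stays None until wait() collects the exit status.

```python
import sys
from subprocess import Popen

# Hypothetical stand-in for a long-running command:
# a child Python process that simply exits with status 3.
child = Popen([sys.executable, '-c', 'import sys; sys.exit(3)'])

before = child.returncode  # None: nothing has collected the exit status yet
status = child.wait()      # blocks until the child exits, then stores the status
after = child.returncode   # now the same value that wait() returned

print(before, status, after)
```

Unlike a poll() loop, wait() does not spin the CPU or flood the output while the child is still running.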

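It is less error-prone to emulate the shell pipeline, for example by letting bash run it with pipefail so that check_call raises CalledProcessError if either command fails: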

    #!/usr/bin/env python
    from subprocess import check_call

    check_call('set -e -o pipefail; tar -cvf - /home | gzip -c > tarball.tar.gz',
               shell=True, executable='/bin/bash')

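The same pipeline can be built without a shell by reversing the process initialization order (start gzip first, then point tar's stdout at gzip's stdin):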

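    #!/usr/bin/env python
    from subprocess import Popen, PIPE, CalledProcessError

    tar_args = ['tar', '-cvf', '-', '/home']
    gzip_args = ['gzip', '-c']
    with open('tarball.tar.gz', 'wb', 0) as tarball_file:
        # start the consumer first so the producer can write into its stdin
        gzip = Popen(gzip_args, stdin=PIPE, stdout=tarball_file)
    tar = Popen(tar_args, stdout=gzip.stdin)
    gzip.communicate()  # closes our copy of gzip.stdin, then waits for gzip
    # wait() sets tar.returncode; raise if either stage failed
    if tar.wait() != 0:
        raise CalledProcessError(tar.returncode, tar_args)
    if gzip.returncode != 0:
        raise CalledProcessError(gzip.returncode, gzip_args)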
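The reversed-order approach can be wrapped in a small helper that returns both exit codes, roughly what $PIPESTATUS gives you in bash. This is a sketch only: pipe_status is a hypothetical name, the consumer's output is discarded for brevity, and the two commands are stand-in Python one-liners rather than tar and gzip.

```python
import sys
from subprocess import Popen, PIPE, DEVNULL

def pipe_status(producer_args, consumer_args):
    """Run `producer | consumer` and return (producer_code, consumer_code).

    Hypothetical helper for illustration; consumer output is discarded.
    """
    # Start the consumer first so the producer can write straight into it.
    consumer = Popen(consumer_args, stdin=PIPE, stdout=DEVNULL)
    producer = Popen(producer_args, stdout=consumer.stdin)
    # Closes the parent's copy of the pipe and waits for the consumer,
    # so the consumer sees EOF as soon as the producer exits.
    consumer.communicate()
    return producer.wait(), consumer.returncode

# Stand-in commands: the producer writes a line and fails with status 2,
# the consumer reads everything and exits successfully.
producer_cmd = [sys.executable, '-c', "import sys; print('data'); sys.exit(2)"]
consumer_cmd = [sys.executable, '-c', "import sys; sys.stdin.read()"]

codes = pipe_status(producer_cmd, consumer_cmd)
print(codes)
```

With these stand-ins the producer's nonzero status is reported even though the consumer succeeds, which is the case the question's code could not detect.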

It might be easier either to use shell=True (if the command is constructed from a trusted input such as a string literal in your source file) or use a library such as plumbum to run the pipeline instead of implementing it on top of Popen directly.

    #!/usr/bin/env python
    from plumbum.cmd import gzip, tar

    (tar['-cvf', '-', '/home'] | gzip['-c'] > 'tarball.tar.gz')()
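For the shell=True route, the following sketch shows why the bash command earlier in this answer sets pipefail. It assumes bash is installed at /bin/bash and uses `false | cat` as a stand-in pipeline whose first command fails.

```python
from subprocess import call

# Stand-in pipeline: `false` always fails, `cat` always succeeds.
# Without pipefail, bash reports only the status of the last command.
without_pipefail = call('false | cat',
                        shell=True, executable='/bin/bash')

# With pipefail, the pipeline's status is that of the rightmost
# command that failed, so the failure of `false` becomes visible.
with_pipefail = call('set -o pipefail; false | cat',
                     shell=True, executable='/bin/bash')

print(without_pipefail, with_pipefail)
```

call() is used here instead of check_call() only so that both statuses can be compared; check_call() would raise CalledProcessError for the second pipeline.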

See "How do I use subprocess.Popen to connect multiple processes by pipes?"

jfs