
I am using Popen to call a shell script that is continuously writing its stdout and stderr to a log file. Is there any way to simultaneously output the log file continuously (to the screen), or alternatively, make the shell script write to both the log file and stdout at the same time?

I basically want to do something like this in Python:

cat file 2>&1 | tee -a logfile #"cat file" will be replaced with some script

Again, this pipes stderr/stdout together to tee, which writes it both to stdout and my logfile.

I know how to write stdout and stderr to a logfile in Python. Where I'm stuck is how to duplicate these back to the screen:

subprocess.Popen("cat file", shell=True, stdout=logfile, stderr=logfile)

Of course, I could just do something like this, but is there any way to do this without tee and shell file descriptor redirection?:

subprocess.Popen("cat file 2>&1 | tee -a logfile", shell=True)
Gino Mempin
imagineerThat
  • related: [Python subprocess get children's output to file and terminal?](http://stackoverflow.com/q/4984428/4279) – jfs Sep 23 '13 at 17:12
  • Related: [How to replicate tee behavior in Python when using subprocess?](https://stackoverflow.com/q/2996887) – tony May 14 '22 at 08:14

3 Answers


You can use a pipe to read the data from the program's stdout and write it to all the places you want:

import sys
import subprocess

logfile = open('logfile', 'w')
proc = subprocess.Popen(['cat', 'file'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
for line in proc.stdout:
    sys.stdout.write(line)
    logfile.write(line)
proc.wait()

UPDATE

In Python 3, the universal_newlines parameter controls how pipes are used. If False, pipe reads return bytes objects that may need to be decoded (e.g., line.decode('utf-8')) to get a string. If True, Python does the decoding for you.

Changed in version 3.3: When universal_newlines is True, the class uses the encoding locale.getpreferredencoding(False) instead of locale.getpreferredencoding(). See the io.TextIOWrapper class for more information on this change.
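For instance, a minimal Python 3 sketch of the same loop in text mode; the child command here is just a placeholder for the real script:

```python
import sys
import subprocess

# universal_newlines=True (spelled text=True since 3.7) makes proc.stdout
# yield str lines, decoded with locale.getpreferredencoding(False),
# instead of bytes objects.
with open('logfile', 'w') as logfile:
    proc = subprocess.Popen([sys.executable, '-c', "print('hello')"],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            universal_newlines=True)
    for line in proc.stdout:
        sys.stdout.write(line)   # echo to the screen
        logfile.write(line)      # mirror to the log file
    proc.wait()
```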

tdelaney
    You could also create a file like object that encapsulates this functionality and then use that in place of `stdout`/`stderr` in the call to `Popen`. – Silas Ray Mar 20 '13 at 21:47
    @sr2222 - I like that idea too.... except now that I think about it..., they are operating system pipes, not python objects, so does that even work? – tdelaney Mar 20 '13 at 21:48
  • Why do you call proc.wait() AFTER having referenced proc.stdout? I'm a little confused where to use proc.wait() in general. – imagineerThat Mar 20 '13 at 21:55
    @imagineerThis - The code reads stdout until it is closed and then waits for the program to exit. You read before wait so that you don't risk the pipe filling up and hanging the program. You wait after read for the final program exit and return code. If you don't wait, you'll get a zombie process (at least on linux). – tdelaney Mar 20 '13 at 22:16
  • Ok, I think I got the basic idea. Had to let it sit for a while. – imagineerThat Mar 20 '13 at 22:50
  • If I don't use wait(), then what will become a zombie process? The subprocess? – imagineerThat Mar 20 '13 at 22:51
    @imagineerThis - yes, the subprocess joins the living dead. In unix/linux, when a process exits, it stays in the pid table until its parent reads its return code via the wait() function or until its parent exits. Run 'ps ax' and it will say 'defunct' with Z (zombie) in the STAT field. With a maximum 64k processes, zombies can kill your system if you have too many wandering the streets. Normally parents babysit their child processes but sometimes you want the children to be independent - so you daemonize them. Look at the python daemon module or google "fork off and die" for details. – tdelaney Mar 21 '13 at 00:00
  • awesome. thanks. I happen to be watching the amc show, the walking dead, right now. love the analogy to zombies. – imagineerThat Mar 21 '13 at 00:16
    Doesn't the one line need to be "sys.stdout.write(line)"? – noisygecko Sep 20 '13 at 21:44
    you might need `iter(proc.stdout.readline, '')` (due to bug with a read-ahead buffer) and add `bufsize=1` to print lines as soon as they are flushed by the child process. call `proc.stdout.close()` to avoid fd leaks. – jfs Sep 23 '13 at 17:04
  • @J.F.Sebastian I vaguely remember pipe issues being sorted out in the 2.6 time frame and since my usage is idiomatic I think I'll stubbornly stick with it. flushing() is a great idea if delays in writing the log file are intolerable, but is resource intensive and should be used only when needed (the user's 'cat' example doesn't, btw). Finally, file handles close automatically when processes exit - explicit closes are not needed for scripts that keep files open until exit. – tdelaney Sep 24 '13 at 04:58
    @tdelaney: no, it is not fixed. try the script: [`import time; print(1); time.sleep(1); print(2)`](http://ideone.com/8KFawl). Your version won't print `1` until the script exits. The word `flush` in my comment refers to buffers inside a *child process* that you have no direct control over. If the child doesn't flush its stdout then the output will be delayed. It might be fixed using [`pexpect`, `pty` modules](http://stackoverflow.com/a/12471855/4279) or [`stdbuf`, `unbuffer`, `script` commands](http://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe). – jfs Sep 24 '13 at 07:19
  • @J.F.Sebastian - wait, you were referring to some read-ahead bug supposedly worked-around by iter - the pty issue is completely different, exists for the 'tee' example given by the user as well as your work-around. And its not a bug, its the way the system is supposed to work. If you want a line buffered pipeline, you need to play the pty game but that's not the default. – tdelaney Sep 24 '13 at 15:19
  • @tdelaney: yes, it is *two* independent issues. `iter` workarounds the bug (just click the first link in my previous comment, to see it yourself). `pty` solves different problem (you should understand it if your read the last two links in the previous comment) – jfs Sep 25 '13 at 03:40
    @tdelaney: I agree with @J.F.Sebastian that there's an issue here, at least for Python 2. See http://bugs.python.org/issue3907 for the relevant bug report, which is closed as 'works for me', with the advice to use the newer `io` module in Python 2.6 and later. Unfortunately, even in Python 2.7 the subprocess module doesn't use the `io` module for its `PIPE`s; as far as Python is concerned they're just `file` objects, and are still subject to the same difference between `for line in my_pipe` and `for line in iter(my_pipe.readline, '')`. – Mark Dickinson Nov 12 '13 at 11:35
  • @Silas Ray - how do you "create a file like object that encapsulates this functionality and then use that in place of stdout/stderr in the call to Popen"? I want my Popen (sub)process to run asynchronously from my main code path, but still write to 2 (or more) different files (and not just the terminal) concurrently. – Johnny Utahh Apr 07 '15 at 05:17
    @JohnnyUtahh - Silas's suggestion doesn't work. The child process writes to the file descriptor, not any `read` or `write` on the file object. You can take my example, make it into a function and call it as a thread. – tdelaney Apr 07 '15 at 17:57
    With python 3.4, I had to replace both `write(line)` by `write(line.decode('utf-8'))` to remove error `TypeError:must be str,not bytes` – jpo38 Jan 04 '16 at 11:36
    @jpo38 Thanks for the note. The post could use a 3.x update. The pipes will return bytes that need decoding unless you set `universal_newlines=True`. I'll update the answer. – tdelaney Jan 04 '16 at 18:12
  • @jpo38 The slightly better fix is to replace write(line) with write(line.decode(sys.stdout.encoding)). That guarantees to match the encoding of the string with stdout. Of course, universal_newlines=True makes it all moot. – DoomGoober Oct 06 '20 at 02:08
  • @DoomGoober - Good point. It's not clear to me which encoding to pick. The program is writing to a pipe. It's reasonable to think that it comes up with the same encoding that Python does, but then maybe not. – tdelaney Oct 06 '20 at 02:24
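Following tdelaney's suggestion in the comments, a sketch of wrapping the tee loop in a function and running it on a background thread, so the main code path is free to do other work; the child command is a placeholder:

```python
import sys
import subprocess
import threading

def tee(proc, logfile):
    # Forward each line from the child's stdout to both the screen and the log.
    for line in proc.stdout:
        sys.stdout.buffer.write(line)  # pipe yields bytes in Python 3
        logfile.write(line)
    proc.wait()

proc = subprocess.Popen([sys.executable, '-c', "print('hello')"],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
with open('logfile', 'wb') as logfile:
    t = threading.Thread(target=tee, args=(proc, logfile))
    t.start()
    # ... main code path continues here ...
    t.join()
```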

To emulate: subprocess.call("command 2>&1 | tee -a logfile", shell=True) without invoking the tee command:

#!/usr/bin/env python2
from subprocess import Popen, PIPE, STDOUT

p = Popen("command", stdout=PIPE, stderr=STDOUT, bufsize=1)
with p.stdout, open('logfile', 'ab') as file:
    for line in iter(p.stdout.readline, b''):
        print line,  #NOTE: the comma prevents duplicate newlines (softspace hack)
        file.write(line)
p.wait()

To fix possible buffering issues (if the output is delayed), see links in Python: read streaming input from subprocess.communicate().

Here's the Python 3 version:

#!/usr/bin/env python3
import sys
from subprocess import Popen, PIPE, STDOUT

with Popen("command", stdout=PIPE, stderr=STDOUT, bufsize=1) as p, \
     open('logfile', 'ab') as file:
    for line in p.stdout: # b'\n'-separated lines
        sys.stdout.buffer.write(line) # pass bytes as is
        file.write(line)
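If you also want to check the exit status (discussed in the comments below), the same pattern can be extended; again the child command is a placeholder:

```python
import sys
from subprocess import Popen, PIPE, STDOUT, CalledProcessError

# Same tee loop, plus an exit-status check after the child finishes.
with Popen([sys.executable, '-c', "print('hello')"],
           stdout=PIPE, stderr=STDOUT) as p, \
     open('logfile', 'ab') as file:
    for line in p.stdout:              # b'\n'-separated lines
        sys.stdout.buffer.write(line)  # pass bytes as is
        file.write(line)
if p.returncode != 0:  # Popen.__exit__ has already waited for the child
    raise CalledProcessError(p.returncode, p.args)
```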
jfs
    you should mention that you can find the return code in p.returncode after it's done. – kdubs Mar 16 '16 at 22:23
    @kdubs: it is unrelated to the question. Why do you think I *"should mention"* it? – jfs Mar 16 '16 at 22:47
    while I agree he didn't ask for that, it seems one ought to check the return status. I was hoping to find it here. would seem to make the answer complete. perhaps "should" was strong. – kdubs Mar 17 '16 at 10:46
    @kdubs I agree that it is a good idea to check the exit status (that is why `subprocess.check_call()`, `subprocess.check_output()` functions exist which do it for you). I could have added `if p.wait() != 0: raise subprocess.CalledProcessError(p.returncode, "command")` but it would distract from the main point: how to emulate `tee` utility in Python. – jfs Mar 17 '16 at 11:33
    Above Python 3 version: prints on screen after execution not live – Ujjawal Khare Dec 26 '17 at 05:15
  • @UjjawalKhare wrong. it is easy to see the output can be produced while the process is still running. It is likely that you've encountered the explicitly mentioned buffering issues. – jfs Dec 26 '17 at 05:18
  • @jfs below is my script which calls another script using the above example and produces the entire output after execution completes, while the expectation is that it should produce live output as well as write to the file at the same time. – Ujjawal Khare Dec 26 '17 at 07:00
  • ./My_Tests.py ALL TC 1 : PASSED TC 2 : PASSED Tatal Passed Tests : 2 Tatal Failed Tests : 0 Inside My_Tests.py : with Popen("./tests.py", stdout=PIPE, stderr=STDOUT, bufsize=1) as p, \ open('My_Test.log', 'ab') as file: for line in p.stdout: # b'\n'-separated lines sys.stdout.buffer.write(line) # pass bytes as is file.write(line) – Ujjawal Khare Dec 26 '17 at 07:05
  • @UjjawalKhare if you don't understand that the code in the answer may produce output *while the process is still running*, ask a separate question. – jfs Dec 26 '17 at 07:09
  • @jfs : just posted. please checkthe url : https://stackoverflow.com/questions/47975233/python-subprocess-delayed-in-screen-buffer-while-writing-print-logs-in-file – Ujjawal Khare Dec 26 '17 at 07:23
  • @jfs: Would be looking for your answer, your code does what i want but i dont know how to fix the buffering issues. Please help with question asked in above link – Ujjawal Khare Dec 26 '17 at 07:31
  • @jfs My apologies. Don't take it otherwise. I said that for me it prints on screen after execution of the script and you said it's a buffer issue, and now I am keen to learn how to fix this. Guidance from your side shall be highly appreciated. – Ujjawal Khare Dec 26 '17 at 07:43
  • @jfs could you please help me ? – Ujjawal Khare Dec 26 '17 at 08:47
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/161957/discussion-between-ujjawal-khare-and-jfs). – Ujjawal Khare Dec 26 '17 at 08:47
  • I don't know why but I tried all the above suggestions and for my Python 3.5 it's not doing both the jobs simultaneously. It executes the entire code and then in the end prints the logs. Not sure if this is the buffer issue (as claimed by @jfs above) and if yes, then I don't know how to fix it so that it can print on the terminal as well as write to the file simultaneously. I have also posted a question and everyone is welcome to help me fix this. https://stackoverflow.com/questions/47975233/python-subprocess-delayed-in-screen-buffer-while-writing-print-logs-in-file – Ujjawal Khare Dec 26 '17 at 11:26
  • @jfs thanks, your solution was correct i just had to use it as a function and split my own script and then call this function everytime. Due to all tests were part of same file so it was printing all the outputs in the end. But now after splitting my script in chunks your solution works perfectly. 1000 likes for you. – Ujjawal Khare Dec 30 '17 at 19:34
  • @UjjawalKhare here is a solution that prints interactive commands immediately: https://stackoverflow.com/a/52090802/895245 – Ciro Santilli OurBigBook.com Aug 30 '18 at 06:51
  • Love the `with Popen(...) as p` : I didn't know you could put it in a context manager. It's a good trick to know. – Jean-Francois T. Sep 03 '21 at 03:17

Write to terminal byte by byte for interactive applications

This method writes any bytes it gets to stdout immediately, which more closely simulates the behavior of tee, especially for interactive applications.

main.py

#!/usr/bin/env python3
import os
import subprocess
import sys
with subprocess.Popen(sys.argv[1:], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) as proc, \
        open('logfile.txt', 'wb') as logfile:
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            logfile.write(byte)
            # logfile.flush()
        else:
            break
exit_status = proc.returncode

sleep.py

#!/usr/bin/env python3
import sys
import time
for i in range(10):
    print(i)
    sys.stdout.flush()
    time.sleep(1)

First we can do a non-interactive sanity check:

./main.py ./sleep.py

And we see it counting to stdout in real time.

Next, for an interactive test, you can run:

./main.py bash

Then the characters you type appear immediately on the terminal as you type them, which is very important for interactive applications. This is what happens when you run:

bash | tee logfile.txt

Also, if you want the output to show up in the output file immediately, then you can add a:

logfile.flush()

but tee does not do this, and I'm afraid it would kill performance. You can test this out easily with:

tail -f logfile.txt

Related question: live output from subprocess command

Tested on Ubuntu 18.04, Python 3.6.7.

Ciro Santilli OurBigBook.com