12

Where is the buffer in the following ... and how do I turn it off?

I am writing out to stdout in a python program like so:

for line in sys.stdin:
    print line

There is some buffering going on here:

tail -f data.txt | grep -e APL | python -u Interpret.py

I tried the following to shake off possible buffering ... with no luck:

  • as above using the -u flag with python invocation
  • calling sys.stdout.flush() after each sys.stdout.write() call. Every one of these attempts still produces a buffered stream, with python waiting something like a minute before printing the first few lines.
  • used the following modified command:

    stdbuf -o0 tail -f data.txt | stdbuf -o0 -i0 grep -e APL | stdbuf -i0 -o0 python -u Interpret.py

To benchmark my expectations, I tried:

tail -f data.txt | grep -e APL 

This produces a steady flow of lines ... it surely is not as buffered as the python command.

So, how do I turn off buffering?

ANSWER: It turns out there is buffering on both ends of the pipe.

fodon

4 Answers

12

file.readlines() and for line in file use an internal read-ahead buffer that the -u option does not affect (see the note on the -u option). Use

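import sys
while True:
    line = sys.stdin.readline()
    if not line:  # an empty string means EOF; without this check the loop never ends
        break
    sys.stdout.write(line)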

instead.

By the way, sys.stdout is line-buffered by default when it points to a terminal, and sys.stderr is unbuffered (see stdio buffering).
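The terminal-versus-pipe distinction can be checked directly. What follows is a minimal sketch, assuming Python 3 (where sys.stdout exposes a line_buffering attribute). The names CHILD_CODE and probe_child_stdout are made up for illustration. It starts a child interpreter whose stdout is a pipe and has the child report what it sees.

```python
import subprocess
import sys

# Code run by the child: report whether its stdout is a terminal and
# whether the text layer flushes on every newline.
CHILD_CODE = "import sys; print(sys.stdout.isatty(), sys.stdout.line_buffering)"


def probe_child_stdout():
    """Run a child Python whose stdout is a pipe and return its report."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,      # the child writes into a pipe, not a terminal
        universal_newlines=True,     # decode the child's output as text
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(probe_child_stdout())
```

With stdout piped like this, both values should come back False. Running the same one-liner directly in a terminal gives a point of comparison.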

ivan_pozdeev
6

The problem, I believe, is that grep is buffering its output. grep does this whenever its stdout is a pipe rather than a terminal, as in tail -f | grep ... | some_other_prog. To get grep to flush once per line, use the --line-buffered option:

% tail -f data.txt | grep -e APL --line-buffered | test.py
APL

APL

APL

where test.py is:

import sys
for line in sys.stdin:
    print(line)

(Tested on linux, gnome-terminal.)
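The same flush-once-per-line idea can be applied on the Python side of the pipe. This is only a sketch, assuming Python 3; filter_lines is a hypothetical helper, not part of test.py above. It reads with readline() and flushes after every write, so the reading end does not hold lines back either.

```python
def filter_lines(pattern, src, dst):
    """Copy lines containing `pattern` from src to dst, flushing per line."""
    # iter(src.readline, "") keeps calling readline() until it returns
    # the empty string (EOF), so each line is handled as soon as it arrives.
    for line in iter(src.readline, ""):
        if pattern in line:
            dst.write(line)
            dst.flush()  # push the line downstream immediately
```

In a script at the end of the pipeline, this could be called as filter_lines("APL", sys.stdin, sys.stdout) after importing sys.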

unutbu
  • 1
    Yes Yes !!! This is the answer ! Though to be precise, I had to use While True: line=sys.stdin.readline() ... your suggestion as it stands didn't work for me. So, I think it was the buffering on both ends of the pipe that made it hard. ... Thanks. – fodon Dec 07 '11 at 20:37
  • Hm, interesting! What version of Python, and what OS are you using? (My version works for Python2.7+Ubuntu+gnome-terminal). – unutbu Dec 07 '11 at 20:40
  • 2
    Good point. With Python 3, your method works. With python 2.6, it doesn't. – fodon Dec 07 '11 at 20:42
3

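The problem is in your for loop. In Python 2, iterating over sys.stdin reads ahead in large chunks, so the loop body does not see a line until the read-ahead buffer fills or EOF arrives. You can fix it with code like this.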

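import sys

while True:
    try:
        # readline() returns as soon as one full line is available
        line = sys.stdin.readline()
    except KeyboardInterrupt:
        break

    if not line:  # an empty string means EOF
        break

    sys.stdout.write(line)  # line already ends with a newline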

Try this out.

0

sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0) and make sure PYTHONUNBUFFERED is set in your environment.

Burhan Khalid
  • nope! That is not it ... should have mentioned that I had already tried it. Also, it seems like PYTHONUNBUFFERED is equivalent to setting the -u flag http://docs.python.org/using/cmdline.html#envvar-PYTHONUNBUFFERED – fodon Dec 07 '11 at 15:13
  • 1
    BTW, with python 3, the above line causes the error: ValueError: can't have unbuffered text I/O – fodon Dec 07 '11 at 15:41
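The ValueError arises because Python 3 only allows buffering=0 on binary streams. Below is a sketch of two possible alternatives, assuming Python 3. Both helper names are hypothetical: one wraps a file descriptor as an unbuffered binary writer, the other as a line-buffered text writer.

```python
import os


def open_unbuffered_binary(fd):
    """Wrap fd in an unbuffered writer; buffering=0 requires binary mode."""
    return os.fdopen(fd, "wb", 0)


def open_line_buffered_text(fd):
    """Wrap fd in a text writer that flushes whenever a newline is written."""
    return os.fdopen(fd, "w", 1)
```

For example, sys.stdout = open_line_buffered_text(sys.stdout.fileno()) would mirror the approach in the answer above. On Python 3.7 and later, sys.stdout.reconfigure(line_buffering=True) is another option that avoids replacing the stream object.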