Possible Duplicate:
Setting smaller buffer size for sys.stdin?

I have a Python (2.4/2.7) script using fileinput to read from standard input or from files. It's easy to use, and works well except for one case:

tail -f log | filter.py

The problem is that my script buffers its input, whereas (at least in this case) I want to see its output right away. This seems to stem from the fact that fileinput uses readlines() to grab up to its bufsize worth of bytes before it does anything. I tried using a bufsize of 1 and it didn't seem to help (which was somewhat surprising).

I did find that I can write code like this which does not buffer:

import sys

# readline() returns each line as soon as it arrives; '' signals EOF.
while True:
    line = sys.stdin.readline()
    if not line:
        break
    sys.stdout.write(line)

The problem with doing it this way is that I lose the fileinput functionality (namely that it automatically opens all the files passed to my program, or stdin if none, and it can even decompress input files automatically).

So how can I have the best of both? Ideally something where I don't need to explicitly manage my input file list (including decompression), and yet which doesn't delay input when used in a "streaming" way.

John Zwinck
  • Close the stdin file handle and reopen it with `buffering = 0` (I haven't tried it, so I'm not going to post it as an answer) – tMC May 17 '11 at 16:11
  • http://stackoverflow.com/questions/3670323/setting-smaller-buffer-size-for-sys-stdin – David May 17 '11 at 16:25
  • You might be mischaracterizing the situation somewhat by saying fileinput uses readlines(). By default, readlines() doesn't return until it hits EOF, whereas 'for line in fileinput.input():' and 'for line in sys.stdin:' will eventually return something once enough characters have been buffered. You could be right that fileinput uses readlines() internally, though, if it passes a bufsize argument. – Don Hatch Feb 03 '16 at 04:09
  • I just filed bug report http://bugs.python.org/issue26290 "fileinput and 'for line in sys.stdin' do strange mockery of input buffering" which includes the behavior you've observed. Summary: fileinput is broken in both 2.7 and 3.4, "for line in sys.stdin:" is broken in 2.7 but fixed in 3.4, readline works properly in both 2.7 and 3.4. – Don Hatch Feb 05 '16 at 02:55

2 Answers

Try running python -u; the man page says that it will "force stdin, stdout and stderr to be totally unbuffered".

You can just add -u to the shebang line at the top of filter.py.

9000
  • `Note that there is internal buffering in xreadlines(), readlines() and file-object iterators ("for line in sys.stdin") which is not influenced by this option.` – tMC May 17 '11 at 16:31
  • Yeah, for the reason tMC stated, this doesn't work. I did try it, though. – John Zwinck May 17 '11 at 16:35
  • Then don't use line-based I/O. Use plain `stdin.read()`. – 9000 May 17 '11 at 21:31
  • readline() (singular) works just fine. It's only readlines() (plural) that does the buffering I don't want. I imagine raw read() would work too, but it's not necessary in this case. – John Zwinck May 18 '11 at 14:13
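
For illustration, here is a minimal sketch of the readline()-based loop that the last comment says works. The helper name `stream_lines` is hypothetical (it is not part of fileinput or the standard library), and the sketch assumes a text-mode stream, where readline() returns an empty string at end of file.

```python
import io
import sys


def stream_lines(stream):
    """Return an iterator that reads one line at a time via readline()."""
    # The two-argument form iter(callable, sentinel) keeps calling
    # stream.readline() until it returns '' (end of file), so every line
    # is handed back as soon as readline() produces it.
    return iter(stream.readline, '')


def copy_lines(source, sink):
    """Copy lines from source to sink, flushing after each one."""
    for line in stream_lines(source):
        sink.write(line)
        sink.flush()  # make the output visible right away
```

Calling `copy_lines(sys.stdin, sys.stdout)` in filter.py would be one way to use it behind `tail -f log`.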

Have you tried:

import fileinput

def hook_nobuf(filename, mode):
    # Passing 0 as the third argument asks open() for an unbuffered file.
    return open(filename, mode, 0)

fi = fileinput.FileInput(openhook=hook_nobuf)

I haven't tested it, but from reading what the openhook parameter does and what passing 0 as the bufsize argument to open() does, this should do the trick.

John Gaines Jr.
  • This has no effect. Again the problem seems to be that fileinput uses the readlines() method and buffers internally. – John Zwinck May 17 '11 at 17:00
  • Well, I think that's your answer then. Either don't use fileinput, or starting with fileinput.py as a base, rewrite it to not buffer internally. Looking at the code, there doesn't seem to be any way to make it not do at least SOME buffering just by passing parameters to it. – John Gaines Jr. May 17 '11 at 17:32
  • I'm new to Python; it seems shocking that this use case is not well covered (it seems very natural to write text filters in Python after all, if it weren't for this). – John Zwinck May 17 '11 at 20:11
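
As a rough sketch of the "don't use fileinput" route suggested in the comments, the generator below walks a list of file names (or stdin if the list is empty) and reads each one with readline(). The names `open_input` and `unbuffered_input` are hypothetical, not a standard API. It is written for Python 3, where `gzip.open(name, 'rt')` gives a text-mode stream, and it only recognises the `.gz` suffix, so it is not a full replacement for fileinput.

```python
import gzip
import sys


def open_input(name):
    """Open one input: '-' is standard input, '*.gz' is gzip-compressed."""
    if name == '-':
        return sys.stdin
    if name.endswith('.gz'):
        return gzip.open(name, 'rt')  # text mode; assumes Python 3.3+
    return open(name)


def unbuffered_input(names=None):
    """Yield lines from each named input in turn, one readline() at a time."""
    if not names:
        names = ['-']  # like fileinput: fall back to stdin with no files
    for name in names:
        stream = open_input(name)
        try:
            # readline() returns '' at end of file, which stops the iterator.
            for line in iter(stream.readline, ''):
                yield line
        finally:
            if stream is not sys.stdin:
                stream.close()
```

In filter.py this could be driven with `for line in unbuffered_input(sys.argv[1:]):`, writing and flushing each line as it arrives.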