
I'm trying to use this one-liner that should print all the lines that are being added to the file /var/log/messages.log.

sudo tail -f /var/log/messages.log | python2 -c 'exec("import sys\n\nfor line in sys.stdin:\n\tprint line")'

For readability, the python code is:

import sys

for line in sys.stdin:
    print line

If I add a single line to /var/log/messages.log, nothing is printed. But if I add a lot of data, output starts to appear.

Is there any defined behaviour for how often iterating over sys.stdin yields data?

PS: My end-goal is to understand the one-liner that outputs the number of lines that are being fed to the python program per second.

EDIT: How is it assumed that interpreter will cross this line if t > e: every one second?

shadyabhi
  • On my machine the linked one-liner does not work very well. It suffers from precisely the same problem that your code does -- reports are delayed until `file.next()`'s read-ahead buffer is satisfied. – Robᵩ Sep 23 '13 at 15:23
  • That's not really a one liner btw. – Erik Kaplun Sep 25 '13 at 12:21
  • P.S. I've updated my answer to explain why the "one liner" works the way it does; however, it still suffers from the same input buffering issue, at least on my computer, just like what @Robᵩ reported. – Erik Kaplun Sep 25 '13 at 12:28

2 Answers


stdin is buffered.

In Python 2, you can disable buffering by using the -u flag when you start Python, or setting the PYTHONUNBUFFERED environment variable.

There are a few caveats to be on the lookout for, but this answer has the most detail.

Thomas Orozco

OK, so here's what worked for me:

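import sys

while True:
    line = sys.stdin.readline()
    if not line:  # readline() returns an empty string at EOF
        break
    print line,  # trailing comma: the line already ends with a newline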

And start the script with python -u ....

I'll admit that Thomas' link to the other thread helped me find out that .readline() should be used directly in order for -u to have any effect.

Explanation: -u disables process-level buffering of stdin (the standard input stream itself, not the sys.stdin object specifically), and calling .readline() instead of iterating with for line in sys.stdin avoids the hidden read-ahead buffer that the file iterator uses internally.
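As a minimal sketch of the readline-based loop, here is the same idea wrapped in a function. Note the assumptions: it is written in Python 3 syntax (the answer above targets Python 2), echo_lines is a hypothetical helper name, and the in-memory streams only stand in for a real pipe.

```python
import io


def echo_lines(stream, out):
    """Copy lines from stream to out one at a time, flushing after each line.

    Hypothetical helper for illustration; returns the number of lines copied.
    """
    count = 0
    # iter(callable, sentinel) keeps calling stream.readline() until it
    # returns "", which is what readline() gives back at end of file.
    for line in iter(stream.readline, ""):
        out.write(line)
        out.flush()  # push each line out immediately instead of batching
        count += 1
    return count


# Demo with in-memory streams (assumed stand-ins for stdin/stdout).
demo_in = io.StringIO("first\nsecond\n")
demo_out = io.StringIO()
echo_lines(demo_in, demo_out)
```

To use it behind tail -f, the call would be echo_lines(sys.stdin, sys.stdout) after importing sys.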

UPDATE: As to your question about why the line if t > e: would be reached every second, the "one-liner" in question is:

import sys, time
l = 0
e = int(time.time())
for line in sys.stdin:
    t = int(time.time())
    l += 1
    if t > e:
        e = t
        print l
        l = 0

time.time() returns the current time in seconds as a float; converting it to int truncates it to whole seconds. e starts out as int(time.time()), so t > e first becomes true when the clock crosses the next whole-second boundary, which happens at most one second after the start. After each report e is set to t, so the condition becomes true again once per elapsed second, provided lines keep arriving: the check only runs inside the loop body, i.e. when a new line has been read.

But the snippet still suffers from exactly the same input buffering issue as your original snippet. It is also invoked without the -u flag, so I cannot imagine why it would ever work reliably on any system, unless the buffering semantics on that system differed both at the Python process STDIN level and in the implementation of sys.stdin.
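The counting logic can be sketched as a generator that combines it with readline-based reading. This is an illustration under stated assumptions, not the original one-liner: it uses Python 3 syntax, lines_per_second is a hypothetical name, and the clock parameter is added only so the timing can be exercised without waiting.

```python
import io
import time


def lines_per_second(stream, clock=time.time):
    """Yield a line count each time the whole-second clock value advances.

    Hypothetical helper; clock is injectable purely for demonstration.
    """
    count = 0
    last = int(clock())  # truncate to whole seconds, like e in the snippet
    for line in iter(stream.readline, ""):
        now = int(clock())
        count += 1
        if now > last:  # a whole-second boundary has been crossed
            last = now
            yield count
            count = 0


# Demo with a fake clock (assumed timestamps) and an in-memory stream.
fake_times = iter([0.2, 0.5, 0.9, 1.1, 1.4, 2.0])
demo_counts = list(
    lines_per_second(io.StringIO("a\nb\nc\nd\ne\n"), clock=lambda: next(fake_times))
)
```

Because the check runs only when a line arrives, a second with no input produces no report, and the next count then covers more than one second.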

Erik Kaplun
  • Part of my question is also: how come that one-liner works perfectly? – shadyabhi Sep 23 '13 at 15:10
  • Important clue (http://docs.python.org/2/library/stdtypes.html#bltin-file-objects) : "*In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right. However, using seek() to reposition the file to an absolute position will flush the read-ahead buffer.*" – Robᵩ Sep 23 '13 at 15:25
  • @shadyabhi: `-u` disables process-level buffering of `stdin` (as in "the standard input" and not the `sys.stdin` object specifically), and using `.readline()` instead of `for line in sys.stdin` avoids the internal buffering of `sys.stdin`. – Erik Kaplun Sep 23 '13 at 15:29
  • Thanks guys. You were of great help. – shadyabhi Sep 27 '13 at 06:06