I'm trying to read from stdin through a codecs reader, using the code below:

import codecs
import sys

ins = codecs.getreader('utf-8')(sys.stdin, 'replace')
for l in ins:
    print l

I have another script that writes to stdout in small bursts of data, and I need my script to process the data after each burst. However, codecs seems to buffer the data: lines written to stdout by the other script don't immediately show up in my reader code above. Is there a parameter that I can set to prevent the buffering?

Thanks!

chris

1 Answer

There are two levels of buffering in this seemingly simple example. To avoid the first level (more as a workaround than a solution), you can read each line as raw bytes and then decode it, rather than decoding the stream and then splitting it into lines. This works because end-of-line is still unambiguously \n in UTF-8. (Note: this first piece of code doesn't work on its own, because it still has the second level of buffering! It is included for explanation purposes.)

for l in sys.stdin:
    l = l.decode('utf-8', 'replace')
    print l

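The second level comes from iterating with for l in file, which in Python 2 uses a hidden read-ahead buffer. Calling readline() explicitly avoids it, so what you actually need is: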

while True:
    l = sys.stdin.readline()
    if not l: break
    l = l.decode('utf-8', 'replace')
    print l
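
The same read-then-decode idea can be wrapped in a small generator. This is only a sketch under stated assumptions: the helper name decoded_lines is hypothetical (it is not part of codecs), the stream is assumed to be a byte stream with a readline() method, and the io.BytesIO object merely stands in for stdin so the example runs by itself. It also strips the trailing newline, which avoids the extra blank line that print would otherwise add.

```python
import io

def decoded_lines(stream, encoding='utf-8', errors='replace'):
    """Yield decoded lines from a byte stream (hypothetical helper)."""
    # iter(callable, sentinel) calls stream.readline() repeatedly until it
    # returns b'' (end of file), so the file iterator is never used.
    for raw in iter(stream.readline, b''):
        # Decode one complete line at a time; invalid bytes become U+FFFD
        # because of errors='replace'. Drop the trailing newline.
        yield raw.decode(encoding, errors).rstrip('\n')

# Stand-in for stdin (assumption for the demo): two lines, one containing
# a byte that is not valid UTF-8.
demo = io.BytesIO(b'first\nsecond \xff\n')
lines = list(decoded_lines(demo))

# With a real pipe, the byte stream would be sys.stdin in Python 2 or
# sys.stdin.buffer in Python 3:
#     for line in decoded_lines(sys.stdin.buffer):
#         handle(line)   # handle() is a placeholder, not a real function
```

Whether each burst arrives promptly still depends on the writing script flushing its own output; the sketch only covers the reading side.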
Armin Rigo
  • Thanks for your help! The second example worked. I only added a `strip()` when reading the input, to remove the extra blank line that was introduced. Alternatively, I could have used `sys.stdout.write()` instead of `print`. I did not get the first example to work, by the way, but I haven't dug into it since my issue has been resolved. – chris Jun 19 '13 at 15:02
  • Added a **note** to my answer. – Armin Rigo Jun 19 '13 at 19:56
  • For some reason I can't seem to find any documentation about the buffering that happens in the `for l in sys.stdin`. Do you happen to have a reference for that? – chris Jun 20 '13 at 14:16
  • http://stackoverflow.com/questions/5076024/in-python-is-read-or-readlines-faster gives some pointers at least. I don't know exactly if it's documented. – Armin Rigo Jun 20 '13 at 23:21