If you readline() from sys.stdin, passing the rest of it to a subprocess does not seem to work.
import subprocess
import sys
header = sys.stdin.buffer.readline()
print(header)
subprocess.run(['nl'], check=True)
(I'm using sys.stdin.buffer to avoid any encoding issues; this handle returns the raw bytes.)
This runs, but I don't get any output from the subprocess:
bash$ printf '%s\n' foo bar baz | python demo1.py
b'foo\n'
If I take out the readline etc., the subprocess reads standard input and produces the output I expect.
bash$ printf '%s\n' foo bar baz |
> python -c 'import subprocess; subprocess.run(["nl"], check=True)'
1 foo
2 bar
3 baz
Is Python buffering the rest of stdin when I start reading it, or what's going on here? Running with python -u does not remove the problem (and indeed, the documentation for it only mentions that it changes the behavior for stdout and stderr). But if I pass in a larger amount of data, I do get some of it:
bash$ wc -l /etc/services
13921 /etc/services
bash$ python demo1.py </etc/services | head -n 3
1 27/tcp # NSW User System FE
2 # Robert Thomas <BThomas@F.BBN.COM>
3 # 28/tcp Unassigned
(... traceback from broken pipe elided ...)
bash$ fgrep -n 'NSW User System FE' /etc/services
91:nsw-fe 27/udp # NSW User System FE
92:nsw-fe 27/tcp # NSW User System FE
bash$ sed -n '1,/NSW User System FE/p' /etc/services | wc
91 449 4082
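(So it looks like it eats 4096 bytes from the beginning: lines 1 through 91 account for 4082 bytes, and the subprocess output starts partway through line 92.)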
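To check the buffering hypothesis without the shell, here is a minimal sketch that reproduces the effect on a pipe created with os.pipe(). The helper name leftover_after_buffered_readline is made up for this illustration, and it assumes the test data is small enough to fit in a single read. It compares what Python is holding in its own buffer with what is still readable from the underlying file descriptor, which is all a child process would see.

```python
import os


def leftover_after_buffered_readline(data: bytes) -> tuple[bytes, bytes, bytes]:
    """Return (first line, bytes held in Python's buffer, bytes left on the fd).

    Hypothetical helper for illustration; assumes `data` is small enough
    to be written to and read from the pipe in one go.
    """
    read_fd, write_fd = os.pipe()
    os.write(write_fd, data)
    os.close(write_fd)  # so reads see EOF once the data is drained

    # closefd=False keeps read_fd open after the buffered reader is closed.
    with open(read_fd, "rb", closefd=False) as buffered:
        first_line = buffered.readline()
        # What a child process inheriting the descriptor would still get.
        on_fd = os.read(read_fd, 4096)
        # What Python already pulled in past the first line.
        in_buffer = buffered.peek()
    os.close(read_fd)
    return first_line, in_buffer, on_fd


if __name__ == "__main__":
    print(leftover_after_buffered_readline(b"foo\nbar\nbaz\n"))
```

If the hypothesis is right, the second element holds the remaining lines and the third is empty, which would match the missing subprocess output in the small example above.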
Is there a way I can avoid this behavior, though? I would like to read only the first line, and pass the rest to the subprocess.
Calling sys.stdin.buffer.readline(-1) repeatedly in a loop does not help.
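One direction that might work is to bypass the buffered reader entirely and read from the file descriptor one byte at a time with os.read(), so that nothing past the newline is consumed. This is only a sketch under that assumption, not a verified fix: the names read_line_from_fd and main are mine, it relies on nothing else having read from sys.stdin beforehand, and it costs one system call per byte, which seems acceptable for a short header line.

```python
import os
import subprocess
import sys


def read_line_from_fd(fd: int) -> bytes:
    """Read one line from a file descriptor, one byte at a time.

    Single-byte os.read() calls skip Python's buffered reader, so no
    data after the newline is taken from the pipe.
    """
    chunks = []
    while True:
        byte = os.read(fd, 1)
        if not byte:  # EOF before a newline was seen
            break
        chunks.append(byte)
        if byte == b"\n":
            break
    return b"".join(chunks)


def main() -> None:
    """Hypothetical replacement for demo1.py; not called automatically."""
    header = read_line_from_fd(sys.stdin.fileno())
    # Flush so the header shows up before the child's output.
    print(header, flush=True)
    # The child inherits file descriptor 0, positioned just after the header.
    subprocess.run(["nl"], check=True)
```

Calling main() at the end of the script would stand in for the body of demo1.py. The idea is that the child process inherits the same descriptor, so whatever this code leaves unread should still be there for nl.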
This is actually a follow-up to "Read line from shell pipe, pass to exec, and keep to variable", but I wanted to focus on this aspect of the problem in that question, which I found surprising.