Python 2 to 3 conversion: iterating over lines in subprocess stdout

Question

I have the following Python 2 example code that I want to make compatible with Python 3:

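from __future__ import print_function  # needed for print(..., end='') in Python 2
import subprocess

call = 'for i in {1..5}; do sleep 1; echo "Hello $i"; done'
p = subprocess.Popen(call, stdout=subprocess.PIPE, shell=True)
for line in iter(p.stdout.readline, ''):
    print(line, end='')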

This works well in Python 2, but in Python 3 p.stdout does not let me specify an encoding, and reading from it returns byte strings rather than Unicode. The comparison with '' is therefore always false, so iter never stops. This issue seems to imply that Python 3.6 will add a way to define the encoding.

For now, I have changed the iter call to stop when it finds an empty bytes string: iter(p.stdout.readline, b''), which seems to work in both 2 and 3. My questions are: Is this safe in both 2 and 3? Is there a better way of ensuring compatibility?
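A minimal sketch of the bytes-sentinel approach just described, with each line decoded explicitly so the caller gets text in both Python 2 and 3. The helper name stream_lines, the UTF-8 default, and the small Python child process (standing in for the bash loop so the example does not depend on the shell) are all assumptions made for illustration.

```python
from __future__ import print_function
import subprocess
import sys

# Hypothetical stand-in for the shell loop in the question: a child Python
# process that prints three lines and flushes after each one.
CHILD = (
    "import sys\n"
    "for i in range(1, 4):\n"
    "    print('Hello %d' % i)\n"
    "    sys.stdout.flush()\n"
)


def stream_lines(cmd, encoding="utf-8"):
    """Yield decoded lines from cmd's stdout as they arrive."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # readline() returns an empty bytes object only at EOF. In Python 2,
    # b'' is the same object type as '', so one sentinel covers both versions.
    for raw in iter(p.stdout.readline, b''):
        yield raw.decode(encoding)
    p.stdout.close()
    p.wait()


lines = list(stream_lines([sys.executable, "-c", CHILD]))
for line in lines:
    print(line, end='')
```

The encoding is hardcoded here; if the child's output encoding is not known in advance, that choice needs more care.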

Note: I'm not using for line in p.stdout: because I need each line to be printed as it's generated, and according to this answer p.stdout uses too large a buffer.

FYI. I tried your code in Ubuntu against Python 2, I get: `Hello {1..5}` — Hai Vu, May 28 '16 at 15:07
Adding `executable='/bin/bash'` to the `Popen` call helps to get the correct output. I am still investigating the failure in Python 3. — Hai Vu, May 28 '16 at 15:24
Thanks @HaiVu, I ran it on a Mac and in CentOS, both with bash, and it worked with no need to specify the executable. When I try it in Ubuntu I see the same as you, no idea why. — foglerit, May 28 '16 at 15:33

score 8 · Accepted Answer · answered May 28 '16 at 17:12

You can add universal_newlines=True.

p = subprocess.Popen(call, stdout=subprocess.PIPE, shell=True, universal_newlines=True)
for line in iter(p.stdout.readline, ''):
    print(line, end='')

Instead of bytes, str will be returned so '' will work in both situations.

Here is what the docs have to say about the option:

If universal_newlines is False the file objects stdin, stdout and stderr will be opened as binary streams, and no line ending conversion is done.

If universal_newlines is True, these file objects will be opened as text streams in universal newlines mode using the encoding returned by locale.getpreferredencoding(False). For stdin, line ending characters '\n' in the input will be converted to the default line separator os.linesep. For stdout and stderr, all line endings in the output will be converted to '\n'. For more information see the documentation of the io.TextIOWrapper class when the newline argument to its constructor is None.

The docs don't explicitly call out the bytes versus str difference, but it is implied by stating that False gives a binary stream and True gives a text stream.
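The difference is easy to check directly. The sketch below reads one line from a child process with and without universal_newlines and compares the types. The helper first_line and the tiny Python child command are assumptions made for illustration, not part of the original code.

```python
import subprocess
import sys

# Hypothetical child command: a Python one-liner, so no shell is required.
CMD = [sys.executable, "-c", "print('Hello')"]


def first_line(**popen_kwargs):
    """Return the first line of CMD's stdout, passing extra options to Popen."""
    p = subprocess.Popen(CMD, stdout=subprocess.PIPE, **popen_kwargs)
    line = p.stdout.readline()
    p.stdout.close()
    p.wait()
    return line


binary_line = first_line()                       # binary stream
text_line = first_line(universal_newlines=True)  # text stream

# Python 3 prints: bytes str
print(type(binary_line).__name__, type(text_line).__name__)
```

On Python 3.6 and later, Popen also accepts encoding= and errors=, which open the pipes in text mode with an explicit encoding instead of the locale default.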

I literally had this problem just 3 days ago in a program I converted to Python 3. — SethMMorton, May 28 '16 at 17:13
Isn't it weird that an option whose name implies handling of newlines changes the format of the whole output (bytes vs str)? Seems like a weird side effect, or bad naming of the argument. Did I misunderstand something? — Gauthier, Feb 09 '22 at 08:23
@Gauthier Yes, it is very weird. In newer versions of Python there is now the alternate name `text` which was added to correct the weirdness. — SethMMorton, Feb 09 '22 at 15:38

score 0 · Answer 2 · answered May 28 '16 at 15:57

You can use p.communicate() and then decode it if it is a bytes object:

from __future__ import print_function
import subprocess

def b(t):
    if isinstance(t, bytes):
        return t.decode("utf8")
    return t

call = 'for i in {1..5}; do sleep 1; echo "Hello $i"; done'
p = subprocess.Popen(call, stdout=subprocess.PIPE, shell=True)
stdout, stderr = p.communicate()

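# splitlines() already returns a list of lines with the line endings stripped,
# so iterate over it directly and let print() add the newline back.
for line in b(stdout).splitlines():
    print(line)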

This works in both Python 2 and Python 3.

noteness

The problem with `communicate` is that it waits until the subprocess is complete before returning the stdout. If this is the intended behavior, it is a good solution, but if the subprocess is long-running or would generate multiple GB of output, this may not be desired. – SethMMorton May 28 '16 at 17:19
