I am trying to read the output of a subprocess called from Python. To do this I am using Popen (because I do not think it is possible to pipe stdout when using subprocess.call).
As of now I have two ways of doing it which, in testing, seem to provide the same results. The code is as follows:
from subprocess import Popen, PIPE

with Popen(['Robocopy', source, destination, '/E', '/TEE', '/R:3', '/W:5', '/log+:log.txt'], stdout=PIPE) as Robocopy:
    for line in Robocopy.stdout:
        # stdout is a bytes stream here, so each line has to be decoded manually
        line = line.decode('ascii')
        message_list = [item.strip(' \t\n').replace('\r', '') for item in line.split('\t') if item != '']
        print(message_list[0], message_list[2])
    Robocopy.wait()
    returncode = Robocopy.returncode
and
with Popen(['Robocopy', source, destination, '/E', '/TEE', '/R:3', '/W:5', '/log+:log.txt'], stdout=PIPE, universal_newlines=True, bufsize=1) as Robocopy:
    for line in Robocopy.stdout:
        # stdout is a text stream here, so no decoding or '\r' handling is needed
        message_list = [item.strip() for item in line.split('\t') if item != '']
        print(message_list[0], message_list[2])
    Robocopy.wait()
    returncode = Robocopy.returncode
The first method does not include universal_newlines=True, and consequently I do not specify bufsize either, as the documentation states that line buffering is only usable in text mode, i.e. with universal_newlines=True.
The second version does include universal_newlines=True, and therefore I specify bufsize=1.
Can somebody explain the difference to me? I can't find the article now, but I did read about an overflowing pipe buffer causing problems, and thus the importance of using for line in stdout.
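To make sure I am describing the right issue, this is roughly the pattern I understood to be risky, next to the pattern I am actually using (the variable names are mine, and source and destination are the same placeholders as in my snippets above):

from subprocess import Popen, PIPE

# The pattern I understood to be risky (I may be misremembering the article):
# if the child writes more output than the OS pipe buffer can hold while the
# parent is only sitting in wait(), the child blocks on the full pipe and the
# parent blocks in wait().
proc = Popen(['Robocopy', source, destination, '/E'], stdout=PIPE)
proc.wait()                    # nothing is draining proc.stdout at this point
output = proc.stdout.read()    # too late if the pipe buffer already filled up

# The pattern I am using instead: read lines as the child produces them, so
# the pipe never fills, and only wait() once the stream is exhausted.
proc = Popen(['Robocopy', source, destination, '/E'], stdout=PIPE)
for line in proc.stdout:
    print(line)
proc.wait()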
Additionally, when looking at the output, not specifying universal_newlines makes stdout a bytes object, but I am not sure what difference that makes (in terms of newlines and tabs) if I just decode the bytes object with ascii myself, compared to universal_newlines mode.
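As far as I can tell from experimenting, the difference seems to be roughly the following (again using the same source and destination placeholders as above; the 'ascii' choice is just what I happen to use):

from subprocess import Popen, PIPE

# Bytes mode: I get the raw output, including the '\r\n' line endings Robocopy
# writes on Windows, and I have to choose the encoding myself.
with Popen(['Robocopy', source, destination, '/E'], stdout=PIPE) as proc:
    for raw_line in proc.stdout:          # raw_line is bytes, e.g. b'...\r\n'
        line = raw_line.decode('ascii')   # still ends with '\r\n' after decoding

# Text mode: universal_newlines=True wraps the pipe in a text stream, so the
# decoding is done for me (with the locale's preferred encoding, I believe,
# rather than ascii) and '\r\n' is translated to '\n'.
with Popen(['Robocopy', source, destination, '/E'], stdout=PIPE,
           universal_newlines=True) as proc:
    for line in proc.stdout:              # line is a str ending with '\n'
        pass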
Lastly, setting bufsize to 1 makes the output "line-buffered", but I am not sure what that means (my rough mental model is sketched at the end of this question). I would appreciate an explanation of how these various elements tie together. Thanks
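For reference, this is the rough mental model of "line-buffered" I mentioned above. I am not sure it reflects what Popen actually does with the pipe internally; it is just what I understand the term to mean for a text stream in general:

import io

# My understanding of line buffering: the text stream flushes its buffer every
# time a newline is written, instead of waiting until the buffer is full.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding='ascii', line_buffering=True)

stream.write('first line\n')     # the '\n' should trigger a flush immediately
print(raw.getvalue())            # expect b'first line\n' to be there already

stream.write('no newline yet')   # without a newline this can stay buffered
print(raw.getvalue())            # expect it to still show only b'first line\n'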