According to the Python 3.5 docs, subprocess.run() returns a CompletedProcess object with a stdout member that contains "A bytes sequence, or a string if run() was called with universal_newlines=True." I'm only seeing a byte sequence and not a string, which I was assuming (hoping) would be equivalent to a text line. For example:

import pprint
import subprocess

my_data = ""
line_count = 0

proc = subprocess.run(
         args = [ 'cat', 'input.txt' ],
         universal_newlines = True,
         stdout = subprocess.PIPE)

for text_line in proc.stdout:
    my_data += text_line
    line_count += 1

word_file = open('output.txt', 'w')
pprint.pprint(my_data, word_file)
pprint.pprint(line_count, word_file)

Note: this uses a new feature in Python 3.5 that won't run in previous versions.

Do I need to create my own line buffering logic, or is there a way to get Python to do that for me?

highpost
  • 1
    This doesn't even run for me (in 3.4) -- is that expected? – kayleeFrye_onDeck Dec 04 '15 at 23:47
  • 2
    I'm using 3.5. I'll make it more clear. – highpost Dec 04 '15 at 23:48
  • 3
    Are you sure you're getting a byte sequence? You may `print type(proc.stdout)` or something to check. You're iterating over `proc.stdout` as though it's a file. Iterating over an open text file gives you each line. Iterating over a string or byte sequence give you each character/byte. If you want to handle each line of a string individually, you could iterate over `proc.stdout.split('\n')` instead. (Though this won't include newlines in each line like iterating over a file would.) – Jeremy Dec 04 '15 at 23:55
  • 1
    Thanks. Chaining split() to proc.stdout was what I was looking for. – highpost Dec 05 '15 at 00:16

3 Answers

proc.stdout is already a string in your case; run print(type(proc.stdout)) to check. It contains the subprocess's entire output -- subprocess.run() does not return until the child process has exited.

The loop for text_line in proc.stdout: is incorrect: iterating over a string in Python yields characters (Unicode code points), not lines. To get lines, call:

lines = proc.stdout.splitlines()

The result may be different from .split('\n') if there are Unicode newlines in the string.
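
To make that difference concrete, here is a minimal sketch on a made-up string (the sample text is an assumption for illustration, not real subprocess output). It contains a Windows line ending, a Unicode line separator (U+2028), and a trailing newline:

```python
# Hypothetical sample text mixing "\n", "\r\n" and U+2028 (LINE SEPARATOR).
text = "first\nsecond\r\nthird\u2028fourth\n"

by_split = text.split("\n")        # splits on "\n" only
by_splitlines = text.splitlines()  # splits on every line boundary str knows

print(by_split)       # ['first', 'second\r', 'third\u2028fourth', '']
print(by_splitlines)  # ['first', 'second', 'third', 'fourth']
```

Note that .split('\n') also leaves an empty string at the end when the text finishes with a newline, while .splitlines() does not.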

If you want to read the output line by line (to avoid running out of memory for long-running processes):

from subprocess import Popen, PIPE

with Popen(command, stdout=PIPE, universal_newlines=True) as process:
    for line in process.stdout:
        do_something_with(line)

Note: process.stdout is a file-like object in this case. Popen() does not wait for the process to finish -- it returns as soon as the child process is started. Here process is a subprocess.Popen instance, not a CompletedProcess.
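
The streaming pattern can be tried end to end with a child that is available wherever Python is. This is only a sketch: the child command (the current interpreter printing three lines) and the lines_seen list are assumptions chosen for illustration.

```python
import subprocess
import sys

# Hypothetical child process: the current Python interpreter printing three lines.
command = [sys.executable, "-c", "for i in range(3): print('line', i)"]

lines_seen = []
with subprocess.Popen(command, stdout=subprocess.PIPE,
                      universal_newlines=True) as process:
    # process.stdout is a text stream here, so iterating yields one line
    # at a time, each still ending with its newline.
    for line in process.stdout:
        lines_seen.append(line.rstrip("\n"))

# Leaving the with block waits for the child, so returncode is set by now.
print(lines_seen)
print(process.returncode)
```

Since Python 3.7, text=True is accepted as an alias for universal_newlines=True.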

If all you need is to count the number of lines (terminated by b'\n') in the output, like wc -l:

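from functools import partial
from subprocess import Popen, PIPE

with Popen(command, stdout=PIPE) as process:
    # Read 8 KiB (1 << 13 bytes) at a time until read() returns b'' at EOF.
    read_chunk = partial(process.stdout.read, 1 << 13)
    line_count = sum(chunk.count(b'\n') for chunk in iter(read_chunk, b''))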

See Why is reading lines from stdin much slower in C++ than Python?
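
As a quick sanity check of the chunked count, the sketch below runs it against a small child and compares the result with splitting the fully captured output. The child command and the names captured and captured_count are assumptions for illustration.

```python
import subprocess
import sys
from functools import partial

# Hypothetical child process: the current interpreter printing three lines.
command = [sys.executable, "-c", "print('a'); print('b'); print('c')"]

# Chunked count: reads at most 8 KiB per call and keeps only the running total.
with subprocess.Popen(command, stdout=subprocess.PIPE) as process:
    read_chunk = partial(process.stdout.read, 1 << 13)
    line_count = sum(chunk.count(b"\n") for chunk in iter(read_chunk, b""))

# Cross-check: capture all output at once with run() and split it into lines.
captured = subprocess.run(command, stdout=subprocess.PIPE)
captured_count = len(captured.stdout.splitlines())

print(line_count, captured_count)
```

The two counts agree here because every line ends with a newline; a final line without one would be counted by splitlines() but not by counting b'\n'.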

jfs
  • There is a typing error. "from subprocess" instead of "from subrocess". But this works for me, thank you. – nhey4kit Nov 30 '22 at 07:51

If you need the stdout lines in a list so that they are easier to manipulate, the step you are missing is splitting the output on the "universal newline" separators:

import subprocess

nmap_out = subprocess.run(args=['nmap', '-T4', '-A', '192.168.1.128'],
                          universal_newlines=True,
                          stdout=subprocess.PIPE)

nmap_lines = nmap_out.stdout.splitlines()
print(nmap_lines)

output is:

['Starting Nmap 7.01 ( https://nmap.org ) at 2016-02-28 12:24 CET', 'Note: Host seems down. If it is really up, but blocking our ping probes, try -Pn', 'Nmap done: 1 IP address (0 hosts up) scanned in 2.37 seconds']
Rider

You are seeing a string. Compare with what you get when universal_newlines is False:

import subprocess
proc = subprocess.run(
    args = [ 'cat', 'input.txt' ],
    universal_newlines = False,
    stdout = subprocess.PIPE)

print(type(proc.stdout))

<class 'bytes'>

run() calls Popen.communicate(). From the docs:

communicate() returns a tuple (stdout_data, stderr_data). The data will be bytes or, if universal_newlines was True, strings.
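
A side-by-side check of both modes might look like the following sketch. The child command (the current interpreter printing one word) and the names as_bytes and as_text are assumptions for illustration.

```python
import subprocess
import sys

# Hypothetical child process: the current interpreter printing one line.
command = [sys.executable, "-c", "print('hello')"]

# Default: stdout is captured as raw bytes.
as_bytes = subprocess.run(command, stdout=subprocess.PIPE)

# universal_newlines=True: stdout is decoded to str and newlines are normalized.
as_text = subprocess.run(command, stdout=subprocess.PIPE,
                         universal_newlines=True)

print(type(as_bytes.stdout))         # <class 'bytes'>
print(type(as_text.stdout))          # <class 'str'>
print(as_text.stdout.splitlines())   # ['hello']
```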

Have a look here for more explanation and other shell interactions.

fiacre