Capture output including control characters of subprocess

Question

I have the following simple program that runs a subprocess and tees its output to both stdout and a buffer:

import subprocess
import sys
import time

import unicodedata

p = subprocess.Popen(
    "top",
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
)

stdout_parts = []
while p.poll() is None:
    for line in iter(p.stdout.readline, b''):
        stdout_parts.append(line)
        text = line.decode("utf-8")
        sys.stdout.write(text)
        for ch in text:
            if unicodedata.category(ch)[0]=="C" and ord(ch) != 10:
                raise Exception(f"control character! {ord(ch)}")
    time.sleep(0.01)

When running a program that updates the terminal in place, such as top or docker pull, I want to capture its entire output as well, even if it is not immediately readable as such.

From reading, for example, "How do commands like top update output without appending in the console?", it seems that this is achieved with control characters. However, I don't receive any of them when reading lines from the process output streams (stdout/stderr). Or do these programs use a different mechanism that I cannot capture from the subprocess?

I rolled back your recent edit (the text is still available from the [revision history](https://stackoverflow.com/posts/70152093/revisions)); you are more than welcome to post that as an answer, but your question should remain strictly a question. — tripleee, Nov 29 '21 at 12:03

tripleee · Accepted Answer · 2021-11-29T12:16:02.793

Many tools adapt their output depending on whether or not they are connected to a terminal. If you want to receive exactly the output you see when running the tool interactively in a terminal, use a wrapper such as pexpect to emulate this behavior. (There is also a low-level pty library but this is tricky to use, especially if you are new to the problem space.)
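
To illustrate the first point, here is a minimal sketch using only the standard library on a Unix-like system. It runs the same child process once with a pipe and once with a pseudo-terminal, and the child reports whether its stdout is a terminal. The helper names `run_with_pipe` and `run_with_pty` are made up for this example, and the child is a Python one-liner rather than top.

```python
import os
import pty
import subprocess
import sys

# Child process: prints True if its stdout is a terminal, False otherwise.
CHILD = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]


def run_with_pipe():
    """Run the child with stdout connected to a pipe."""
    result = subprocess.run(CHILD, stdout=subprocess.PIPE, check=True)
    return result.stdout.decode().strip()


def run_with_pty():
    """Run the child with stdout connected to a pseudo-terminal."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(CHILD, stdout=slave_fd)
    os.close(slave_fd)  # the child keeps its own copy of the slave end
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Linux raises EIO once every copy of the slave end is closed
            break
        if not data:
            # other platforms may signal end of file with an empty read
            break
        chunks.append(data)
    proc.wait()
    os.close(master_fd)
    # the terminal layer turns "\n" into "\r\n"; strip() removes both
    return b"".join(chunks).decode().strip()
```

Here `run_with_pipe()` should give `"False"` and `run_with_pty()` should give `"True"`. This is the same check many tools use to decide whether to emit screen updates.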

Some tools also allow you to specify a batch operation mode for scripting; maybe look into top -b (though this is not available e.g. on macOS).
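
As a rough sketch of the batch-mode approach, a small helper can run a command without any terminal and return its plain-text output. The name `capture_batch` and its `timeout` parameter are hypothetical and only serve this example.

```python
import subprocess


def capture_batch(cmd, timeout=10):
    """Run cmd (a list of arguments) without a terminal and return its stdout as text."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.decode("utf-8", errors="replace")
```

With the procps version of top found on most Linux systems, `capture_batch(["top", "-b", "-n", "1"])` should return a single plain-text snapshot, where `-b` selects batch mode and `-n 1` stops after one iteration. Other top implementations may use different options.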

For the record, many screen control sequences do not consist entirely or even mainly of control characters; for example, the control sequence to move the cursor to a particular position in curses starts with an escape character (0x1B), but otherwise consists of regular printable characters. If you really want to process these sequences, probably look into using a curses / ANSI control code parsing library. But for most purposes, a better approach is to use a machine-readable API and disable screen updates entirely. On Linux, a lot of machine-readable information is available from the /proc pseudo-filesystem.
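
To make the point about printable characters concrete, here is a small sketch that inspects the ANSI "cursor position" sequence with the same `unicodedata` check the question uses. The regular expression only covers CSI sequences (those starting with ESC `[`), so treat it as an illustration and not as a complete terminal parser. The names `MOVE_CURSOR`, `control_chars`, and `strip_csi` are made up for this example.

```python
import re
import unicodedata

# CSI "cursor position" sequence: ESC [ row ; column H (here row 5, column 10)
MOVE_CURSOR = "\x1b[5;10H"

# ESC [, then parameter bytes (0x30-0x3F), intermediate bytes (0x20-0x2F),
# and one final byte (0x40-0x7E)
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def control_chars(text):
    """Return the characters of text that Unicode classifies as control (Cc)."""
    return [ch for ch in text if unicodedata.category(ch) == "Cc"]


def strip_csi(text):
    """Remove CSI sequences from text and leave everything else untouched."""
    return CSI_RE.sub("", text)
```

Of the seven characters in `MOVE_CURSOR`, only the leading escape is a control character, so a check that looks at characters one by one sees the rest as ordinary text.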

Do you know if there's a way with which I can set the pseudo terminal size for the subprocess? It seems to know them and print control sequences accordingly — Mugen, Nov 29 '21 at 15:37
https://stackoverflow.com/questions/263890/how-do-i-find-the-width-height-of-a-terminal-window is a PHP question but has links to several useful resources. It's a bit of a chicken and egg problem; curses really does try to figure out the size of your screen, but there are some ways you can override that IIRC. — tripleee, Nov 30 '21 at 06:50
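
On the terminal size question, one possible approach is to set the window size on the pseudo-terminal yourself with the `TIOCSWINSZ` ioctl before starting the subprocess. This is a sketch for Unix-like systems and has not been tested against top specifically. The helper names `set_pty_size` and `open_sized_pty` are hypothetical.

```python
import fcntl
import pty
import struct
import termios


def set_pty_size(fd, rows, cols):
    """Set the window size of the terminal behind fd."""
    # struct winsize: rows, columns, width in pixels, height in pixels
    winsize = struct.pack("HHHH", rows, cols, 0, 0)
    fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)


def open_sized_pty(rows, cols):
    """Open a pseudo-terminal and give it the requested size."""
    master_fd, slave_fd = pty.openpty()
    set_pty_size(slave_fd, rows, cols)
    return master_fd, slave_fd
```

The slave descriptor returned by `open_sized_pty(40, 120)` can then be passed as `stdout` to `subprocess.Popen`, and a program that asks its terminal for the size should see 40 rows and 120 columns.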

score 1 · Answer 2 · answered Nov 29 '21 at 12:17

Salvaged content from reverted edit to question:

Some solution that prints top nicely with the tip from the answer:

import os
import pty
import select
import subprocess
import sys

stdout_master_fd, stdout_slave_fd = pty.openpty()
stderr_master_fd, stderr_slave_fd = pty.openpty()

p = subprocess.Popen(
    "top",
    shell=True,
    stdout=stdout_slave_fd,
    stderr=stderr_slave_fd,
    close_fds=True
)

stdout_parts = []
while p.poll() is None:
    # The timeout lets the loop notice that the process has exited
    # instead of waiting forever for output that never comes.
    rlist, _, _ = select.select([stdout_master_fd, stderr_master_fd], [], [], 0.1)
    for f in rlist:
        # select() reported f as readable, so this read returns immediately
        output = os.read(f, 1000)
        if f == stdout_master_fd:
            stdout_parts.append(output)
        # A 1000-byte chunk can end in the middle of a multi-byte character
        sys.stdout.write(output.decode("utf-8", errors="replace"))
        sys.stdout.flush()

You really don't want or need `shell=True` here. Switch to `['top']` and drop the `shell=True`. See also [Actual meaning of `shell=True` in subprocess](https://stackoverflow.com/questions/3172470/actual-meaning-of-shell-true-in-subprocess) — tripleee, Nov 29 '21 at 12:18
