I use Python to test an already compiled binary. The idea is:

  • I open a subprocess running a separate program (this process does nothing on its own and just listens for commands)
  • Then I send various commands to this subprocess (using mysubprocess.stdin.write())
  • Then, depending on my needs, I either validate the output from the subprocess or ignore it

My problem is that sometimes I'd like to ignore the output. For example, I'd like to send 1M commands to the subprocess, ignore their results (to speed up the simulation) and check only the output of the last one.

However, it seems (this is my suspicion only!) that I have to consume the stdout buffer; otherwise, it hangs forever...

Here is an example:

from subprocess import Popen, PIPE

class Simulation:
    def __init__(self, path):
        self.proc = Popen([path], stdin=PIPE, stdout=PIPE)

    def executeCmd(self, cmd):
        self.proc.stdin.write(cmd.encode('utf-8'))
        self.proc.stdin.flush()
        output = ""
        line = ""

        while '</end>' not in line:
            line = self.proc.stdout.readline().decode('utf-8')
            output += line

        return output
        
    def executeCmd_IgnoreOutput(self, cmd):
        self.proc.stdin.write(cmd.encode('utf-8'))
        self.proc.stdin.flush()
        ## self.proc.stdout.read() #< can't do that since subprocess is still running and there is no EOF sign
        self.proc.stdout.flush() #< does not clear the buffer :(

    def tearDown(self):
        self.proc.stdin.write("exit_command".encode('utf-8'))
        self.proc.stdin.flush()
        exit_code = self.proc.wait()
    
simulation = Simulation(r"\path\to\binary")  # raw string so the backslashes are not treated as escapes
output = simulation.executeCmd("command")
#do something with the output

for i in range(1000000):
    simulation.executeCmd_IgnoreOutput("command")  # hangs after a few thousand iterations
simulation.tearDown()
    

executeCmd consumes the whole output (I cannot use read, since the subprocess is still running and there is no EOF at the end of the output). But this is very expensive - I have to iterate through all the lines...

So my prototype was to create executeCmd_IgnoreOutput, which doesn't consume the buffer. But it hangs after a few thousand iterations.
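That hang is consistent with the pipe simply filling up: a pipe holds only a limited number of bytes, and once it is full the child blocks on its next write, so it stops reading stdin and both processes stall. A quick way to see the capacity (POSIX; the exact number varies by platform):

```python
import os

# Fill a pipe in non-blocking mode and count how many bytes fit
# before the write side would block.
r, w = os.pipe()
os.set_blocking(w, False)
capacity = 0
try:
    while True:
        capacity += os.write(w, b"x" * 4096)
except BlockingIOError:
    pass  # the pipe is full: a blocking writer would hang here
os.close(r)
os.close(w)
print(capacity)  # typically 65536 on Linux
```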

My questions are:

  • Maybe I made a mistake at the very beginning: is the subprocess package suitable for usage like this? Maybe I should use a different tool for such a purpose...
  • If so, then how can I clear the stdout buffer? (flush doesn't work in that case - it still hangs)
  • Or maybe it hangs for different reasons (any ideas?)
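For reference, one direction I'm considering (a sketch only, not tested against my real binary, with a toy echo child standing in for it): a daemon thread that continuously drains stdout into a queue, so the pipe can never fill up, plus a counter of commands whose responses I chose to ignore:

```python
import queue
import subprocess
import threading


class Simulation:
    """Sketch: a daemon thread keeps the child's stdout pipe drained,
    so the child can never block on a full pipe buffer."""

    def __init__(self, args):
        self.proc = subprocess.Popen(
            args, stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )
        self.lines = queue.Queue()
        self.pending = 0  # commands whose responses were ignored so far
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        # Runs until the child closes its stdout; the pipe stays empty.
        for raw in self.proc.stdout:
            self.lines.put(raw.decode("utf-8"))

    def _read_response(self):
        # Assumes each command yields exactly one '</end>'-terminated reply.
        output = ""
        while "</end>" not in output:
            output += self.lines.get()  # blocks until the reader queues a line
        return output

    def executeCmd(self, cmd):
        self.proc.stdin.write(cmd.encode("utf-8"))
        self.proc.stdin.flush()
        while self.pending:  # discard responses of ignored commands first
            self._read_response()
            self.pending -= 1
        return self._read_response()

    def executeCmd_IgnoreOutput(self, cmd):
        self.proc.stdin.write(cmd.encode("utf-8"))
        self.proc.stdin.flush()
        self.pending += 1  # its response is drained in the background

    def tearDown(self):
        # The toy child exits on stdin EOF; the real binary may need
        # an explicit exit command instead.
        self.proc.stdin.close()
        return self.proc.wait()
```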
Patryk
  • Why do you say "can't [read stdout] since subprocess is still running and there is no EOF sign"? Of course you can read the stream; you'll get all the output the process has written so far. – AKX Sep 01 '22 at 13:54
  • (But yes, you do need to consume the buffer.) – AKX Sep 01 '22 at 13:55
  • When I call `self.proc.stdout.read()` it hangs forever. I read here ( https://stackoverflow.com/a/33886970 ) that this is because of lacking an EOF sign. That's why I read the output line by line and deduce EOF from the output logic. – Patryk Sep 01 '22 at 14:00
  • Ah, right – call `read()` with any non-None argument until it doesn't return anything. – AKX Sep 01 '22 at 14:20

1 Answer

To drain the stdout pipe, make sure it isn't buffered, then read it in chunks until there's nothing to read:

import subprocess


class Simulation:
    def __init__(self, path):
        self.proc = subprocess.Popen(
            [path],
            stdin=subprocess.PIPE, 
            stdout=subprocess.PIPE,
            bufsize=0,  # unbuffered
        )

    def executeCmd(self, cmd):
        self.proc.stdin.write(cmd.encode("utf-8"))
        self.proc.stdin.flush()
        output = b""
        while True:
            data = self.proc.stdout.read(65536)
            if not data:
                break
            output += data
        return output

    def tearDown(self):
        self.executeCmd("exit_command")
        return self.proc.wait()
AKX
  • `read` is still blocking in this version. So I experience deadlock. It doesn't work either with 65536 or with 1 byte as an argument. It waits for the EOF :( – Patryk Sep 01 '22 at 14:26
  • @Patryk Oops, sorry. Try with `bufsize=0` (see edit). – AKX Sep 01 '22 at 14:27
  • Unfortunately `read` still hangs forever even with `bufsize=0` :-( – Patryk Sep 01 '22 at 14:33
  • I can think of no good reason why `read()` would hang forever and `readline()` wouldn't. – AKX Sep 01 '22 at 14:34
  • Well, I use `readline` in a specific way (by validating the payload of each line). Based on the app logic, I know that the last line contains the tag `</end>`. If I validated `line = readline()` against `Empty` I would end up with the same problem. – Patryk Sep 01 '22 at 14:38
  • No, I mean, `readline()` internally calls `read()` until it finds a newline character. If it doesn't block, then there's no way `read()` could block either, in the same conditions. – AKX Sep 01 '22 at 14:41
  • The output from the subprocess contains lines ended by `\n`, but it doesn't have an `EOF` sign. So if you `read` it line by line (or even byte by byte) it will work, as long as the buffer is not empty. When all lines are consumed, it hangs on the `read` call, because the buffer is empty but there is no `EOF`... – Patryk Sep 01 '22 at 15:12
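What the last comment describes - `read()` blocking on an empty pipe that has no EOF - can be worked around on POSIX by flipping the descriptor into non-blocking mode before draining. With `bufsize=0` the stdout pipe is a raw `FileIO`, whose `read()` returns `None` instead of blocking when no data is available yet. A sketch (the helper name `drain_pipe` is mine, not from the thread):

```python
import os


def drain_pipe(pipe):
    """Discard whatever is currently sitting in `pipe` without blocking.

    Assumes `pipe` is an unbuffered Popen stdout (bufsize=0). Note this
    only drains bytes that have already arrived; output the child has
    not written yet will show up later.
    """
    fd = pipe.fileno()
    os.set_blocking(fd, False)
    try:
        while True:
            chunk = pipe.read(65536)
            if not chunk:  # None means "no data yet", b"" means EOF
                return
    finally:
        os.set_blocking(fd, True)
```

The caveat is the race noted in the docstring: draining only removes output that has already reached the pipe, which is why a dedicated reader thread is the more robust pattern when responses must be matched to commands.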