The code below is a toy example of the actual situation I am dealing with1. (Warning: this code will loop forever.)
import subprocess
import uuid
class CountingWriter:
def __init__(self, filepath):
self.file = open(filepath, mode='wb')
self.counter = 0
def __enter__(self):
return self
def __exit__(self, exc_type, exc_value, traceback):
self.file.close()
def __getattr__(self, attr):
return getattr(self.file, attr)
def write(self, data):
written = self.file.write(data)
self.counter += written
return written
with CountingWriter('myoutput') as writer:
with subprocess.Popen(['/bin/gzip', '--stdout'],
stdin=subprocess.PIPE,
stdout=writer) as gzipper:
while writer.counter < 10000:
gzipper.stdin.write(str(uuid.uuid4()).encode())
gzipper.stdin.flush()
writer.flush()
# writer.counter remains unchanged
gzipper.stdin.close()
In English, I start a subprocess, called gzipper
, which receives input through its stdin
, and writes compressed output to a CountingWriter
object. The code features a while
-loop, depending on the value of writer.counter
, that at each iteration, feeds some random content to gzipper
.
This code does not work!
More specifically, writer.counter
never gets updated, so execution never leaves the while
-loop.
This example is certainly artificial, but it captures the problem I would like to solve: how to terminate the feeding of data into gzipper
once it has written a certain number of bytes.
Q: How must I change the code above to get this to work?
FWIW, I thought that the problem had to do with buffering, hence all the calls to *.flush()
in the code. They have no noticeable effect, though. Incidentally, I cannot call gzipper.stdout.flush()
because gzipper.stdout
is not a CountingWriter
object (as I had expected), but rather it is None
, surprisingly enough.
1 In particular, I am using a /bin/gzip --stdout
subprocess only for the sake of this example, because it is a more readily available alternative to the compression program that I am actually working with. If I really wanted to gzip
-compress my output, I would use Python's standard gzip
module.