
I run a subprocess from Python like this (not my script):

  with contextlib.redirect_stdout(log_file):
    # ....
    processResult = subprocess.run(args,
                    stdout=sys.stdout,
                    stderr=sys.stderr,
                    timeout=3600)

and sometimes the process goes crazy (due to an intermittent bug) and dumps so many errors into stdout, and therefore the log file, that the file grows to 40 GB and fills up the disk.

What would be the best way to protect against that? Being a Python newbie, I have 2 ideas:

  • piping the subprocess into something like head that aborts it if the output grows beyond a limit (not sure whether this is possible with subprocess.run, or whether I have to go the low-level Popen way)

  • finding or creating some handy IO wrapper class IOLimiter which would throw an error after a given size (couldn't find anything like this in stdlib and not even sure where to look for it)

I suspect there would be some smarter/cleaner way?

inger

1 Answer


I recently had this problem myself. I did it with subprocess.Popen, setting PYTHONUNBUFFERED=1 in the environment:

import subprocess
import time

test_proc = subprocess.Popen(
    my_command,
    universal_newlines=True,   # decode output as text, one str per line
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # merge stderr into the same pipe
)

print(time.time(), "START")
# Iterate over the lines of output produced
for out_data in iter(test_proc.stdout.readline, ""):
    # Check whatever I need.
    pass
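
For the size limit itself, the same Popen idea can be sketched roughly as below. This is only a sketch under assumptions, not a tested drop-in for subprocess.run: run_with_output_limit, max_bytes and chunk_size are made-up names, log_file is assumed to be opened in binary mode, and the timeout is approximated with a threading.Timer that kills the child. It reads raw chunks instead of lines, so the Python loop runs once per chunk rather than once per line.

```python
import subprocess
import threading


def run_with_output_limit(args, log_file, max_bytes, timeout=3600, chunk_size=65536):
    """Copy the child's combined stdout/stderr into log_file (binary mode).

    The child is killed once it has produced more than max_bytes, or when
    the timeout expires. Returns (returncode, bytes_written, limit_hit).
    """
    written = 0
    limit_hit = False
    with subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
    ) as proc:
        # Kill the child if it is still running after `timeout` seconds.
        timer = threading.Timer(timeout, proc.kill)
        timer.start()
        try:
            while True:
                # read1 returns as soon as some data is available,
                # and b"" once the child has closed its end of the pipe.
                chunk = proc.stdout.read1(chunk_size)
                if not chunk:
                    break
                remaining = max_bytes - written
                log_file.write(chunk[:remaining])
                written += min(len(chunk), remaining)
                if len(chunk) > remaining:
                    # The child produced more than allowed: stop it.
                    limit_hit = True
                    proc.kill()
                    break
        finally:
            timer.cancel()
    # Leaving the with block closes the pipe and waits for the child.
    return proc.returncode, written, limit_hit
```

Calling it with something like run_with_output_limit(args, open("run.log", "wb"), max_bytes=10**9) would cap the log at about 1 GB. The caller still has to decide what to do when limit_hit is true or when the return code shows that the child was killed.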
Prune
  • Thanks. I was hoping not to have to switch to Popen, but this doesn't look bad. I guess you'd still have to think about killing the process, timeouts, etc. (and whatever else run() is doing). My other concern would be performance: I forgot to mention that this file ends up being 40 GB, so I would prefer some optimised processing rather than a Python loop. Or should I not worry about it? Updating the question with clarifications anyway. – inger Jan 29 '19 at 19:36
  • Yes, depending on your application, there are various considerations to the interaction. This one at least gives you a non-blocking situation and a reasonably efficient loop: remember, this is one end of a pipe, so the loop iterates *only* when the child process writes a line. Otherwise, the parent process is simply waiting. – Prune Jan 29 '19 at 19:40