
I have been running some manual benchmarks in my shell using the time command. I would like to scale them up by writing a Python script that both automates the tests and gives me access to the timing data, so that I can record it in the format of my choosing (likely CSV). I see there is the timeit module, but that seems to be intended for benchmarking Python code, whereas what I am trying to benchmark here are programs run from the command line.

This is what I have been doing manually:

time program -aflag -anotherflag

My initial attempt to implement this in a script looks like:

cmnd = ['time', 'program', 'aflag', 'anotherflag']
p = subprocess.Popen(cmnd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate
print out
print err

I can access the output of time just fine (it is delivered to stderr), but I am not getting program's output in stdout as expected. If I remove time from cmnd and change shell=False to True, I then get the program's output in stdout, but obviously not time's output, which is the whole point.

cmnd = ['program', 'aflag', 'anotherflag']
p = subprocess.Popen(cmnd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate
print out
print err

If I add time back to cmnd with shell=True, I get time's output but program doesn't actually run.

How can I get both working?

dongle
  • Please copy-paste an **actual** short, standalone program that demonstrates your error. Please do not just type it in, but test it, run it, and copy-paste it. Thanks. See http://stackoverflow.com/help/mcve – Robᵩ Oct 16 '14 at 19:51
  • The `time` command in the shell doesn't show the program's result! It just shows the `real`, `user`, and `sys` times. – Mazdak Oct 16 '14 at 19:51
  • After fixing the syntax errors in your test program, and replacing your `cmnd` with `cmnd = ['time', 'ls', '-s', '-h']`, I get the correct results in both `out` and `err`. [This link](http://ideone.com/ALUs52) contains my program. – Robᵩ Oct 16 '14 at 19:52
  • So what's your program? Is it Python code, or ...? – Mazdak Oct 16 '14 at 19:56
  • @Kasra: It's even worse than that; the program `/usr/bin/time` and the shell builtins named `time` in different shells are all different in what they do with the program's output and/or error, which makes it very confusing for novices to write code around the command… – abarnert Oct 16 '14 at 20:17
  • note: it is an error to use `shell=True` and a list argument in this case (like in most cases). 'program' doesn't see its arguments in this case. Either use `shell=False` or pass the shell command as a string. – jfs Oct 16 '14 at 20:31
  • you could [use `ru = os.wait4(p.pid, 0)` + `resource` module to get the data without `/usr/bin/time` command](http://stackoverflow.com/a/22733285/4279). There is [`psutil.Process(p.pid)` that allows to get process info in a portable manner](https://gist.github.com/zed/9859060). – jfs Oct 16 '14 at 20:35
  • @J.F.Sebastian: You don't need `os.wait` for anything; `p.communicate()` (if you remember the parens) will take care of the waiting for you. – abarnert Oct 16 '14 at 21:40
  • @abarnert: os.wait4() is used to get the resource usage info. Follow the links to see the code. – jfs Oct 16 '14 at 21:44
  • @J.F.Sebastian: But it's not necessary. `p.communicate()` waits for the process (with `waitpid` on *nix), `resource.getrusage` gets the resource information. If you want to distinguish between separate child processes, then you need something more complicated; if you just have one, do it the easy way. – abarnert Oct 16 '14 at 23:08
  • @abarnert: you've answered it yourself: `os.wait4()` allows you to pass child's pid explicitly to distinguish between several child processes. – jfs Oct 16 '14 at 23:18
  • @J.F.Sebastian: But it makes things significantly more complicated. You can't call `os.wait4` and also call `p.communicate`, so you have to… well, see my edited answer. If you don't need the extra complexity, don't do it. – abarnert Oct 16 '14 at 23:22
  • it could be done much simpler e.g., redirect to a temporary file if you need the output (unrelated to resource usage). – jfs Oct 16 '14 at 23:31
  • @J.F.Sebastian: Redirecting to a temporary file and then reading that file is not simpler than just calling `resource.rusage`. – abarnert Oct 16 '14 at 23:35
  • It *is* simpler than subclassing Popen and overwriting private methods like in your answer. – jfs Oct 16 '14 at 23:37
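The point made in the comments about combining `shell=True` with a list argument can be checked with a small experiment. This is only a sketch, assuming a POSIX system where `echo` is available; `run` and the three result names are hypothetical helpers introduced for illustration. On POSIX, with `shell=True` and a list, only the first item is treated as the shell command and the remaining items become arguments to the shell itself, so the command never sees them.

```python
import subprocess


def run(cmd, shell):
    """Run cmd and return its stdout as stripped text (hypothetical helper)."""
    p = subprocess.Popen(cmd, shell=shell,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, _err = p.communicate()
    return out.decode().strip()


# List + shell=False: 'hello' is passed to echo as its argument.
list_no_shell = run(['echo', 'hello'], shell=False)

# List + shell=True (POSIX): runs `sh -c echo hello`, so echo gets no arguments.
list_with_shell = run(['echo', 'hello'], shell=True)

# String + shell=True: the shell parses the whole command line.
string_with_shell = run('echo hello', shell=True)

print(repr(list_no_shell), repr(list_with_shell), repr(string_with_shell))
```

By the same rule, `['time', 'program', 'aflag', 'anotherflag']` with `shell=True` only hands `time` to the shell, which would explain why program never runs in that variant.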

1 Answer


Instead of trying to get this to work, why not use the functionality built into Python in the resource module?

import resource
import subprocess

cmd = ['program', 'aflag', 'anotherflag']
p = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()
usage = resource.getrusage(resource.RUSAGE_CHILDREN)
print out
print err
print usage.ru_utime, usage.ru_stime, usage.ru_utime+usage.ru_stime
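Since the goal in the question is to record timings (likely as CSV), the snippet above could be extended roughly as follows. This is a sketch rather than a definitive implementation: it is written for Python 3 (unlike the Python 2 snippets in this thread), and `time_command`, `write_csv`, and the column names are hypothetical. Because `getrusage(resource.RUSAGE_CHILDREN)` reports totals for all children waited for so far, the sketch takes a reading before and after each run and records the difference. It assumes only one child runs at a time.

```python
import csv
import resource
import subprocess
import time

FIELDS = ['command', 'real', 'user', 'sys']  # hypothetical column names


def time_command(cmd):
    """Run cmd once and return (stdout, stderr, timings dict)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    p = subprocess.Popen(cmd, shell=False,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = p.communicate()  # also waits for the child
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    timings = {
        'command': ' '.join(cmd),
        'real': real,
        # RUSAGE_CHILDREN is cumulative, so subtract the earlier reading.
        'user': after.ru_utime - before.ru_utime,
        'sys': after.ru_stime - before.ru_stime,
    }
    return out, err, timings


def write_csv(rows, path):
    """Write a list of timings dicts to path as CSV with a header row."""
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A benchmark loop would then call `time_command(cmd)` for each command, collect the timings dicts in a list, and pass that list to `write_csv`.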

If you need to distinguish different child processes running simultaneously, getrusage is obviously not sufficient. In that case, you need to use wait4 or similar to get per-process resource usage. This makes your use of Popen more complicated. What you'd probably want to do for this case is subclass or fork the subprocess code, change the _try_wait method to use wait4 instead of waitpid, and stash the extra results in, e.g., self.rusage so you can access them later. (Make sure to use the subprocess32 backport if you're on 3.1 or earlier, both to avoid the bugs in communicate and so that the class actually has the method you want to hook.)

I think something like this would work:

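import errno
import os

import subprocess32

class Popen(subprocess32.Popen):
    def _try_wait(self, wait_flags):
        """All callers to this function MUST hold self._waitpid_lock."""
        try:
            # wait4 is like waitpid but also returns the child's resource usage.
            # _eintr_retry_call is assumed to be subprocess32's private module-level helper.
            (pid, sts, rusage) = subprocess32._eintr_retry_call(os.wait4, self.pid, wait_flags)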
            if pid == self.pid:
                self.rusage = rusage
        except OSError as e:
            if e.errno != errno.ECHILD:
                raise
            pid = self.pid
            sts = 0
        return (pid, sts)

cmd = ['program', 'aflag', 'anotherflag']
p = Popen(cmd, shell=False, stdout=subprocess32.PIPE, stderr=subprocess32.PIPE)
out, err = p.communicate()
print out
print err
try:
    usage = p.rusage
except AttributeError:
    print 'Child died before we could wait on it, no way to get rusage'        
else:
    print usage.ru_utime, usage.ru_stime, usage.ru_utime+usage.ru_stime
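For comparison, the alternative raised in the comments on the question (calling `os.wait4` directly and redirecting the child's output to a temporary file) can be sketched without subclassing Popen. This is an illustration under stated assumptions, not the approach recommended above: it targets Python 3 on a Unix system, `run_with_rusage` is a hypothetical name, and because it bypasses Popen's own waiting logic it sets `p.returncode` by hand.

```python
import os
import subprocess
import tempfile


def run_with_rusage(cmd):
    """Run cmd; return (stdout bytes, exit code, rusage) for that child only."""
    with tempfile.TemporaryFile() as out_file:
        # Send stdout to a file so no pipe can fill up while we block in wait4.
        p = subprocess.Popen(cmd, shell=False, stdout=out_file)
        # Reap this specific child ourselves to get its own resource usage.
        _pid, status, rusage = os.wait4(p.pid, 0)
        # Popen did not do the waiting, so record the exit status manually.
        if os.WIFEXITED(status):
            p.returncode = os.WEXITSTATUS(status)
        else:
            p.returncode = -os.WTERMSIG(status)
        out_file.seek(0)
        return out_file.read(), p.returncode, rusage
```

The returned `rusage` has the same `ru_utime` and `ru_stime` fields used above, but it covers only the one child that was waited on. The trade-off is that `p.communicate()` can no longer be used for that process.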
abarnert