
I am trying to get the following CLI command to work in Python.

pgrep fmserverd fmserver_helperd fmslogtrimmer fmxdbc_listener | wc -l

A return of 4 tells me that all 4 processes are running. This works fine in the CLI but does not behave correctly in Python. I am doing the following with just one of the processes, in the interactive Python interpreter:

import subprocess
print subprocess.check_output ([ 'pgrep', 'fmserver_helperd', '|', 'wc', '-l'], shell=True, stderr=subprocess.PIPE)
or
print subprocess.check_output ([ 'pgrep', 'fmserver_helperd', '|', 'wc', '-l'], shell=True)

Which returns:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['pgrep', 'fmserver_helperd', '|', 'wc', '-l']' returned non-zero exit status 2

If I write a program in a file I get basically the same result. That code protects check_output with a try/except block, and the output looks like this:

sp.check_output: ['pgrep', 'fmserver_helperd', '|', 'wc', '-l'] ro: True sh: True out: 
usage: pgrep [-Lfilnoqvx] [-d delim] [-F pidfile] [-G gid]
         [-P ppid] [-U uid] [-g pgrp]
         [-t tty] [-u euid] pattern ...
DoSubProcess exception: Command '['pgrep', 'fmserver_helperd', '|', 'wc', '-l']' returned non-zero exit status 2

I am really stuck on this and would really appreciate your help. TY

Keith
  • `|` is an instruction to a shell, not an argument to pgrep. Putting that instruction on a command line tells the shell to run multiple programs -- so to simulate it properly you'd be creating multiple `Popen` objects (see the sketch after these comments). – Charles Duffy Aug 15 '17 at 20:43
  • That is why I am using shell=True. When that is included, I thought the whole thing was sent out as one command? – Keith Aug 15 '17 at 20:43
  • ...and when you pass `shell=True` with a list, only the first element of the list is treated as a script by that shell. So the shell is just running a script containing only `pgrep`. – Charles Duffy Aug 15 '17 at 20:43
  • Why `wc -l` here rather than just counting the number of lines of output in native Python? – Charles Duffy Aug 15 '17 at 20:44
  • I was thinking of that when I was typing this up, but subprocess is so strange that I thought I would post the question and learn something in the process. It would be easier if the CLI portion would handle it. Get back to you in a sec after I try. – Keith Aug 15 '17 at 20:47
  • To be clear, `shell=True` just modifies your array of arguments by adding `['sh', '-c']` to the front of it. That's why the first argument is treated as a script -- `-c` means "the next argument is a script to run". – Charles Duffy Aug 15 '17 at 20:50
  • Ug... OK that seems to have worked. Just need to figure out how to count the lines. Then I need to do the reverse and bring my processes up again. Thanks Charles. Oh, and just saw your other comment... got it and TY again. – Keith Aug 15 '17 at 20:51
  • You **really** shouldn't be using pgrep to ensure that you only have one instance of a program running (or to restart something when it dies). It's not remotely the right tool for the job -- the right tool is a process supervision framework; your OS almost certainly provides one for you (systemd, launchd, upstart, etc), and there are some excellent 3rd-party ones otherwise ([s6](https://skarnet.org/software/s6/why.html), [runit](http://smarden.org/runit/), etc). – Charles Duffy Aug 15 '17 at 20:51
  • Interesting, not sure what a process supervision framework is. I'm on Mac and Windows. The standard CLI approach is `ps awx | grep -e <process name>`, which for some reason causes other issues and returns a list of other processes that I did not request in Python. Works fine in the CLI but fails in Python. `pgrep` is a combination of `ps` and `grep` together. If you have other suggestions I'll listen :) – Keith Aug 15 '17 at 20:58
  • Oh, systemd and upstart are not included; launchd is the reason why I'm writing this watcher. It doesn't wait for the command to complete. Instead it spawns or despawns 6 or 8 processes and returns immediately. – Keith Aug 15 '17 at 21:01
  • Yes, I'm familiar with pgrep (it's actually quite a bit smarter than `ps | grep`). launchd is configurable enough to do... well, whatever you need it to. – Charles Duffy Aug 15 '17 at 21:01
  • BTW, have you considered advisory locking? See https://docs.python.org/2/library/fcntl.html#fcntl.flock – Charles Duffy Aug 15 '17 at 21:02
  • See [Locking a file in Python](https://stackoverflow.com/questions/489861/locking-a-file-in-python), which has some answers portable to Windows. That's going to be vastly more reliable than trying to inspect the process list (which is an innately expensive and race-condition-prone operation). – Charles Duffy Aug 15 '17 at 21:03
  • ...if you have two programs start at the same time, and they both use `pgrep`, they can *both* see the other also present, and can *both* decide to exit or abort. Or if you don't start the program until after the check, they can both see the program not started yet and both decide that they can safely start it. Just say no to race conditions -- use a real locking mechanism. – Charles Duffy Aug 15 '17 at 21:06
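A minimal sketch of the two-`Popen` pipeline mentioned in the comments above, reusing the fmserver_helperd name from the question, might look like this:

import subprocess

# Wire pgrep's stdout into wc's stdin the way the shell's `|` would,
# instead of handing '|' to pgrep as an argument.
pgrep = subprocess.Popen(['pgrep', 'fmserver_helperd'], stdout=subprocess.PIPE)
wc = subprocess.Popen(['wc', '-l'], stdin=pgrep.stdout, stdout=subprocess.PIPE)
pgrep.stdout.close()  # let pgrep receive SIGPIPE if wc exits first
print(int(wc.communicate()[0]))  # 0 if nothing matched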

1 Answer


If you want to use shell=True with a pipeline, doing that in a safe and robust way (proof against shell injection attacks) might look like:

def count_instances(progname):
    return int(subprocess.check_output(['pgrep "$1" | wc -l', '_', progname], shell=True))

Note that we're passing a valid shell script as the first element of the list. The second element is $0 to that script (for which we're passing a placeholder); the third is $1.
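A quick usage sketch (illustrative only; the four daemon names are taken from the question):

daemons = ['fmserverd', 'fmserver_helperd', 'fmslogtrimmer', 'fmxdbc_listener']
for name in daemons:
    # Each daemon should report at least one running instance.
    print('%s: %d' % (name, count_instances(name)))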


However, that's not particularly good practice when instead you could run:

def count_instances(progname):
    return subprocess.check_output(['pgrep', progname]).count('\n')

...which doesn't require shell=True at all.
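One caveat: pgrep exits with a non-zero status when nothing matches, so this version raises CalledProcessError when the process isn't running at all (the shell pipeline above doesn't, because wc still exits 0). A variant that treats "no match" as zero, sketched along the same lines, might be:

def count_instances(progname):
    # pgrep exits non-zero when no process matches; treat that as zero instances.
    try:
        return subprocess.check_output(['pgrep', progname]).count('\n')
    except subprocess.CalledProcessError:
        return 0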

Charles Duffy
  • or just `count('\n')` to get the number of lines (saves the splitting) – Jean-François Fabre Aug 15 '17 at 20:52
  • @Jean-FrançoisFabre, *nod* -- the difference is if we have invalid output without a trailing newline, but that shouldn't happen with pgrep. – Charles Duffy Aug 15 '17 at 20:53
  • yes, that's the debate about whether the last line of a file should end with a newline or not: is that a line? :) – Jean-François Fabre Aug 15 '17 at 20:53
  • Amended per your suggestion. :) – Charles Duffy Aug 15 '17 at 20:54
  • I still don't understand why the OP's command fails (I wouldn't use shell=True & pipe, but still...) – Jean-François Fabre Aug 15 '17 at 20:57
  • @Jean-FrançoisFabre, the OP's original code runs `pgrep` with no arguments at all, which is invalid usage. `pgrep; echo $?` returns 2, the same exit status they're showing. – Charles Duffy Aug 15 '17 at 20:58
  • The failure mode is precisely the same as `sh -c pgrep fmserver_helperd '|' wc -l` at a shell: Only the argument immediately following the `-c` is parsed as a shell script; the other arguments are just stuffed into `$0`, `$1`, etc.; and the script given in the first argument doesn't look at or use them (see the short demonstration after these comments). – Charles Duffy Aug 15 '17 at 21:00
  • @Jean-FrançoisFabre, ...so, to state it differently, `fmserver_helpd` is an argument to `sh`, but it's never passed as an argument to `pgrep`. – Charles Duffy Aug 15 '17 at 21:07
  • I'm finding that this is very unpredictable. I have had some commands work with parameters and using the | (bar) and some not work. I'm finding that the subprocess class needs to be reworked to become more reliable and easily predictable. @CharlesDuffy again thanks so much for the help. Little by little I'll start to understand subprocess, which I rely on heavily as a tester. – Keith Aug 15 '17 at 22:25
  • @Keith, I **strongly** disagree with your characterization -- the `subprocess` module is extremely predictable. (Badly-written shell scripts not as much so, but that's a different matter). If you can give me a reproducer for a piece of behavior, I'll be happy to explain where that behavior is coming from. – Charles Duffy Aug 15 '17 at 22:28
  • haha!!! Understood... I'm only partially a CLI geek not 100% yet. I'll think about taking you up on that when I run across them in the future. Hard to go back at this point. – Keith Aug 15 '17 at 22:29
  • To be fair, shell is full of [pitfalls](http://mywiki.wooledge.org/BashPitfalls) and [traps](http://wiki.bash-hackers.org/scripting/newbie_traps) for the unwary (not to mention [obsolete](http://wiki.bash-hackers.org/scripting/obsolete) syntax); backwards compatibility with software written almost half a century ago tends to have that effect. But that's not the `subprocess` module's fault; `subprocess` is very reliable in terms of exactly how it calls other programs -- it's what those other programs do after being called when they happen to be shells that's more likely to be surprising. :) – Charles Duffy Aug 15 '17 at 22:30
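The $0/$1 stuffing described in the comments above is easy to see with a small, purely illustrative experiment:

import subprocess

# With shell=True and a list, only the first element runs as the shell script;
# the remaining elements become $0, $1, ... and are ignored unless the script uses them.
print(subprocess.check_output(['echo "script sees $0 and $1"', 'ZERO', 'ONE'], shell=True))
# prints: script sees ZERO and ONE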