Inside a subprocess call, I want to use shell=True so that it does globbing on pathnames (code below). However, this has the annoying side effect of making subprocess spawn a child process (which must then be communicate()d / poll()ed / wait()ed / terminate()d / kill()ed / whatevah).

(Yes I am aware the globbing can also be done with fnmatch/glob, but please show me the 'correct' use of subprocess on this, i.e. the minimal incantation to both get the stdout and stop the child process.)

This works fine (returns output):

subprocess.check_output(['/usr/bin/wc','-l','[A-Z]*/[A-Z]*.F*'], shell=False)

but this hangs

subprocess.check_output(['/usr/bin/wc','-l','[A-Z]*/[A-Z]*.F*'], shell=True)

(PS: It's seriously aggravating that you can't tell subprocess you want some but not all shell functionality e.g. globbing but not spawning. I think there's a worthy PEP in that, if anyone cares to comment, i.e. pass in a tuple of Boolean, or an or of binary flags)

(PPS: whether you pass subprocess...(cmdstring.split() or [...]) is just a trivial idiomatic difference. I say tomato, you say tomay-to. In my case, the motivation is that the command is fixed but I may want to call it more than once with a different filespec.)

smci
  • How would you tell the underlying _shell_ that you want only some but not all of its functionality? Until the underlying shell supports it (across a wide enough array of platforms), there's not much support for having an API to access it... – Charles Duffy Mar 26 '12 at 22:24
  • @Charles: I didn't say tell shell. I said ***tell subprocess***. subprocess can fake out with calls to fnmatch instead of full-blown shell, I really don't give a hoot how it achieves it. But this is doable, if painful. – smci Mar 26 '12 at 22:26
  • I read `shell=True` to mean that I want an actual UNIX shell invoked. If you suddenly have `subprocess` *faking* being a shell, you're asking it to figure out what the Right Thing (or expected thing, at least) is on the native shell of every platform Python supports. Keep in mind that different shells support different variants on glob syntax, different forms of redirection, different flow control constructs (as those can certainly be used in one-liners started by `subprocess`)... you'd basically have to implement an actual UNIX shell in Python, and that would be insane. – Charles Duffy Mar 26 '12 at 22:33
  • The intent with `shell=True` is `"give me globbing"`. I know I get some other unwanted (and possibly insecure) behavior, that's the price shell=True forces you to pay (which as I say is in my opinion a design wart). – smci Mar 26 '12 at 22:41
  • That may be _your_ intent in using `shell=True`, but to assume such as universal seems like a bit of a stretch -- speaking for myself, when I use `shell=True`, I want redirection and all the rest, and I wouldn't use it otherwise. A PEP proposing a standard-library call to evaluate globs in an argument list, with integration with subprocess to allow the cwd against which they're evaluated to be appropriately adjusted, _does_ seem like a good idea; painting it as an alternative to `shell=True` is the only part I find questionable. – Charles Duffy Mar 27 '12 at 00:09
  • @Charles: that's entirely my point: there are multiple separate use cases for saying `shell=True`. I am well aware that there are many other reasons for using shell than globbing. – smci Mar 27 '12 at 21:53
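
The idea raised in the comments above, expanding globs in Python relative to a chosen working directory and then running the command without a shell, could be sketched roughly as follows. This is only an illustration under stated assumptions: `expand_filespec` and `run_on_matches` are hypothetical helper names, not part of `subprocess`, and Python 3.4+ is assumed for `glob.escape` and `subprocess.DEVNULL`.

```python
import glob
import os
import subprocess

def expand_filespec(filespec, cwd="."):
    """Return paths under cwd matching a shell-style pattern, relative to cwd."""
    # glob.escape protects any wildcard characters in cwd itself
    pattern = os.path.join(glob.escape(cwd), filespec)
    return sorted(os.path.relpath(match, cwd) for match in glob.glob(pattern))

def run_on_matches(command, filespec, cwd="."):
    """Run command (a list) on every file matching filespec, without a shell."""
    files = expand_filespec(filespec, cwd)
    if not files:
        # Refuse to run: with no file arguments, tools like wc read stdin instead
        raise ValueError("no files match %r" % filespec)
    # Hypothetical design choice: give the child an empty stdin as well
    return subprocess.check_output(list(command) + files, cwd=cwd,
                                   stdin=subprocess.DEVNULL)
```

A call such as `run_on_matches(['/usr/bin/wc', '-l'], '[A-Z]*/[A-Z]*.F*')` would then cover the fixed-command, varying-filespec case from the question. Note that Python's `glob` only approximates shell globbing (no brace expansion, for instance).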

1 Answer

First off -- there's very little point to passing an array to:

subprocess.check_output(['/usr/bin/wc','-l','[A-Z]*/[A-Z]*.F*'], shell=True)

...as this simply runs wc with no arguments, in a shell that is also passed -l and [A-Z]*/[A-Z]*.F* as arguments (to the shell, not to wc). Instead, you want:

subprocess.check_output('/usr/bin/wc -l [A-Z]*/[A-Z]*.F*', shell=True)
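
To see where the extra list items end up, here is a small sketch, assuming a POSIX system where `shell=True` runs `/bin/sh -c <first item> <remaining items...>`. The helper name `shell_positional_args` is made up for illustration; it asks the shell to print its own `$0` and `$1...` instead of running wc.

```python
import subprocess

def shell_positional_args(extra_args):
    """Return what the shell itself receives as $0, $1, ... from a list."""
    # The first list item is the script the shell runs; the remaining items
    # become the shell's positional parameters, not arguments to the command.
    script = 'printf "%s\\n" "$0" "$@"'
    out = subprocess.check_output([script] + list(extra_args), shell=True)
    return out.decode().splitlines()

print(shell_positional_args(['-l', '[A-Z]*/[A-Z]*.F*']))
```

Both items come back unchanged as the shell's own parameters, which is why wc in the list form never sees `-l` or the filespec.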

Before the correction, the call hung because wc received no arguments and therefore read from stdin. I would suggest giving the child an empty stdin, rather than letting it inherit your Python program's stdin (which is the default behavior).

An easy way to do this, since you have shell=True:

subprocess.check_output(
    '/usr/bin/wc -l [A-Z]*/[A-Z]*.F* </dev/null',
    shell=True)

...alternately:

# stdin=PIPE plus an empty input gives wc an immediate end-of-file on stdin
p = subprocess.Popen('/usr/bin/wc -l [A-Z]*/[A-Z]*.F*', shell=True,
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=None)
(output, _) = p.communicate(input=b'')

...which will ensure an empty stdin from Python code rather than relying on the shell.
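
On Python 3.3 and later, `subprocess.DEVNULL` is another way to get the same effect from Python code without managing a pipe. The following is a minimal sketch, not part of the original answer; `run_with_empty_stdin` is a hypothetical helper name.

```python
import subprocess

def run_with_empty_stdin(command):
    """Run a shell command string with stdin connected to the null device."""
    # shell=True with a single string, so the shell expands any globs in it.
    # stdin=subprocess.DEVNULL makes any read from stdin hit end-of-file at
    # once, so a command that falls back to stdin cannot block on the terminal.
    return subprocess.check_output(command, shell=True,
                                   stdin=subprocess.DEVNULL)
```

For the command discussed here, the call would be `run_with_empty_stdin('/usr/bin/wc -l [A-Z]*/[A-Z]*.F*')`.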

Charles Duffy
  • Aha! True, the files exist in some but not all of the globbed subdirectories, I take it for granted subprocess does not do insane things like spawning with no arguments. So how do I make this work without extreme pain? (and other than manually using fnmatch)? I mean, the UNIX equivalent ***just works***. – smci Mar 26 '12 at 22:29
  • @smci no, the UNIX equivalent doesn't Just Work -- it has the exact same problem, at least if the `nullglob` shell option is true (which, admittedly, it isn't by default, for exactly this reason). Also, see my amendment. – Charles Duffy Mar 26 '12 at 22:34
  • Interesting. It still hangs with redirected `* – smci Mar 26 '12 at 22:39
  • @smci can't reproduce that. `subprocess.check_output('wc -l – Charles Duffy Mar 26 '12 at 22:45
  • Sorry, somehow I dropped the brackets in the filespec, doh. *`subprocess.check_output('/usr/bin/wc -l [A-Z]*/[A-Z]*.F* – smci Mar 26 '12 at 22:53
  • @smci sure, I'd be glad to add that. – Charles Duffy Mar 26 '12 at 23:01
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/9341/discussion-between-charles-duffy-and-smci) – Charles Duffy Mar 26 '12 at 23:07
  • @smci, ...btw, while this answer was a good one, my explanation for the original problem was not (nothing to do with nullglob, everything to do with how arrays are handled with `shell=True`). I've gone back and corrected it. – Charles Duffy Apr 23 '15 at 13:52