0

I was wondering how I can pass a python variable to subprocess.check_output command.

In this particular case I have lower and upper python variables to be passed to the subprocess.check_output command, but I'm certain the way I have done it below isn't correct because it's not giving me the expected result.

If I input the values for the lower and upper bound values manually it does work.

 for qq in range (0, 5, 1): 
      lo = glob.glob(path2 + "IM" + path1 + "*_GM.nii.gz")  
      lo = ' '.join(lo)
      lower = qq - 0.5
      upper = qq + 0.5
      subprocess.check_output(['fslstats {} -l lower -u upper -V | cut -d " " -f 1'.format(lo)], shell=True)

Any suggestions how I can pass the lower and upper variables?

Note:

lo= /Users/say/Documents/awIM/network5/awfc_GM.nii.gz

path2=/Users/say/Documents/aw

path1=/network5/awfc

Thanks

hsayya
  • 131
  • 1
  • 10
  • 1
    Do not **ever** use `'...'.format()` or equivalents to generate a string parsed as code; doing so invites major security bugs. – Charles Duffy Jan 25 '19 at 04:36
  • Instead, pass the arguments as separate list elements. `['fslstats "$1" -l ...', '_', str(lo)]` means that `lo` can't contain a value like `$(rm -rf ~)` (all characters valid in filenames, and thus which can be present in a `glob.glob()` result) and wipe your home directory. – Charles Duffy Jan 25 '19 at 04:37
  • ...beyond that, though, it'd do a lot of good if you actually logged your `args`, whether it's a single string, a list, or whatever else; that way you can simplify the question, to be about why your code generates a bad value (if such is the case), or why a specific value you believe to be correct behaves in an unexpected way. – Charles Duffy Jan 25 '19 at 04:39
  • ...right now, we don't know what your `path1` or `path2` variables contain, and we don't know what files exist that match your globs, so I don't see how anyone could possibly provide a definitive answer about what's going wrong (lots of *possible* problems come up on a code read, but there's no certainty about which one is at hand). Log the specific arguments passed to `check_output` in unambiguous (`repr()`'d) form, and then there's no question. – Charles Duffy Jan 25 '19 at 04:40
  • @CharlesDuffy Unfortunately, `shell=True` requires everything be passed as a single string, which will be parsed (dangerously) by a shell. And that's necessary to get the pipe treated as a pipe, not just as an argument to `fslstats` – Gordon Davisson Jan 25 '19 at 04:41
  • @GordonDavisson, untrue -- read the implementation if you doubt. `shell=True` prepends `['sh', '-c']` to the `args` value passed by the user (if what's given for `args` is a string rather than a list, it's coerced to a single-item list); there can be multiple arguments following -- the first one is parsed as a shell script, the second as `$0` in the context of that script, the third as `$1`, etc. Thus, the first element in the list is the only one where shell injection vulnerabilities can take place, unless the script itself contains them (using `eval`, invoking another shell, etc). – Charles Duffy Jan 25 '19 at 04:41
  • @GordonDavisson, ...that said, I'll certainly grant that `shell=True` is *usually* passed just one string, because it's usually used by people who either don't know why that usage pattern is dangerous or don't care, while most folks who *do* care deeply about correctness (and/or know the ins and outs of `subprocess`) typically use `shell=False` with a separate `Popen` instance per pipeline component; but it's entirely possible to use it in a safer way. – Charles Duffy Jan 25 '19 at 04:48
  • @CharlesDuffy Hah, I did not know that! I don't immediately see an answer showing how to do it safely; do you want to write one up or should I? – Gordon Davisson Jan 25 '19 at 04:59
  • @GordonDavisson, more a close-as-dupe situation; we already have this asked and answered elsewhere in the knowledgebase. – Charles Duffy Jan 25 '19 at 05:07
  • @CharlesDuffy Thanks for the suggestion Charles. I'm still quite new to python and thought the only way to call a bash directory was by using glob.glob, now reading your comments I realize it's not. I'm not sure how I can fix this now. I've also added more to my question (to clarify what the variables are, which you were asking about). Again any help to correct this would be greatly appreciated. – hsayya Jan 25 '19 at 15:43
  • @CharlesDuffy Can you please direct me to that? I've tried looking for something to help me figure it out? Or do you have any suggestions of what I could do? – hsayya Jan 25 '19 at 15:48
  • BTW, using `check_output` on your original code, the check was for the exit status of `cut` (being the last thing in the pipeline), not the exit status of `fslstats`, so it wouldn't actually tell you if `fslstats` succeeded or not. – Charles Duffy Jan 25 '19 at 16:14
  • (That's not a Python thing, it's a shell thing; to get bash to reflect anything but the end of a pipeline's exit status, you need to use `set -o pipefail`, but that's a non-POSIX extension and thus not guaranteed to be available with `/bin/sh` as used by `shell=True`). – Charles Duffy Jan 25 '19 at 16:28

1 Answers1

1

Posted Community Wiki because this is a question already asked and answered elsewhere in the knowledgebase.


Doing this correctly (but for the removal of the cut in favor of native-Python string manipulation) might look something like:

glob_str = path2 + "IM" + path1 + "*_GM.nii.gz"
glob_list = glob.glob(glob_str)
if len(glob_list) == 0:
    raise Exception("No results found from glob expression %r" % glob_str)
for qq in range (0, 5, 1):
    lower = qq - 0.5
    upper = qq + 0.5
    args = ['fslstats'] + glob_list + [ '-l', str(lower), '-u', str(upper), '-V' ]
    ### EVERYTHING BELOW HERE IS UNNECESSARILY COMPLICATED BY THE USE OF 'cut'
    ### CONSIDER REPLACING WITH THE ALTERNATE IMPLEMENTATION  BELOW.
    p1 = subprocess.Popen(args, stdout=subprocess.PIPE)
    p1.stdout.close()
    p2 = subprocess.Popen(['cut', '-d', ' ', '-f1'], stdin=p1.stdout)
    (stdout, _) = p2.communicate()
    if p1.wait() != 0:
      raise Exception("fslstats run as %r returned exit status %r" % (args, p1.returncode))
    print("Result is: %r" % (stdout.split("\n"),))

To remove cut, you might change everything below the line assigning args as follows:

    stdout = subprocess.check_output(args)
    first_column = [ line.split()[0] for line in stdout.split('\n') ]
    print("Result is: %r" % first_column)

Note:

  • We're not using shell=True. Keeping this disabled makes for an implementation where you have more control -- a shell isn't doing things behind your back, and you don't need to know how that shell works and how it's implemented to avoid (potentially security-impacting) bugs.
  • To implement a pipeline without shell=True, we're following the practices documented at https://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline
  • We're actually passing the values of the lower and upper variables, instead of passing the command lower and upper strings.
  • We're not joining our glob results into a string (which would break our command if any filenames resulting from that glob contained spaces), but are instead passing the list directly on the argument list for fslstats.
  • Because you care about the exit status of fslstats, not cut, you need to check that yourself. (Even with shell=True, you get default shell behavior, which returns only the exit status of the last pipeline component).
Charles Duffy
  • 280,126
  • 43
  • 390
  • 441