
I am trying to run the following Bash command from Python, but I am running into trouble because of all the redirection ("<", ">", "|") and the parentheses.

"comm -13 <(sort 9-21-pull/animals.tsv | uniq) <(sort full-data/9-28-full-data/animals.tsv | uniq) > 9-28-pull/animals.tsv"

How do I run this Bash command in Python using subprocess?

RavinderSingh13

1 Answer


The smallest possible change is to invoke bash explicitly. shell=True uses /bin/sh, which isn't guaranteed to support process substitution (the <(...) syntax).

Passing filenames out-of-band from the shell command avoids security bugs: hostile or unexpected filenames can't run arbitrary commands. (Similarly, preferring sort <"$1" to sort "$1" means that a filename starting with a dash can't be misread as an option.)

import subprocess

subprocess.Popen([
  'bash', '-c', 'comm -13 <(sort -u <"$1") <(sort -u <"$2") >"$3"',
  '_',                                    # this becomes $0
  '9-21-pull/animals.tsv',                # this becomes $1
  'full-data/9-28-full-data/animals.tsv', # this becomes $2
  '9-28-pull/animals.tsv'                 # this becomes $3
]).wait()  # block until bash (and the pipeline inside it) has finished
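To see why out-of-band arguments are safer, here is a minimal sketch. The helper name echo_arg_safely and the sample filename are hypothetical, chosen only for illustration. The script text is fixed, and the filename reaches the shell as "$1", so any shell syntax inside it is treated as plain data:

```python
import subprocess

def echo_arg_safely(filename: str) -> str:
    """Print a filename via sh, passing it as $1 rather than pasting it into the script."""
    result = subprocess.run(
        ['sh', '-c', 'printf "%s\\n" "$1"',  # the script itself never changes
         '_',                                # this becomes $0
         filename],                          # this becomes $1
        capture_output=True, text=True, check=True)
    return result.stdout

# A hypothetical hostile filename: the command substitution is never evaluated.
print(echo_arg_safely('$(echo pwned).tsv'))
```

Had the same filename been interpolated into the command string instead, the shell would have run the embedded echo.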

Now, what if you don't want to use a shell at all? I'm going to leave in the separate uniq commands here just to show a more complicated case.

#!/usr/bin/env python3
import subprocess

input1 = '9-21-pull/animals.tsv' # I'm using a different file when testing, ofc
input2 = 'full-data/9-28-full-data/animals.tsv'
outfile = '9-28-pull/animals.tsv'

sort_input1 = subprocess.Popen(['sort'],
                               stdin=open(input1, 'r'),
                               stdout=subprocess.PIPE)
uniq_input1 = subprocess.Popen(['uniq'],
                               stdin=sort_input1.stdout,
                               stdout=subprocess.PIPE)
sort_input2 = subprocess.Popen(['sort'],
                               stdin=open(input2, 'r'),
                               stdout=subprocess.PIPE)
uniq_input2 = subprocess.Popen(['uniq'],
                               stdin=sort_input2.stdout,
                               stdout=subprocess.PIPE)
comm_proc   = subprocess.Popen(['comm', '-13',
                                f'/dev/fd/{uniq_input1.stdout.fileno()}',
                                f'/dev/fd/{uniq_input2.stdout.fileno()}'],
                               stdout=open(outfile, 'w'),
                               pass_fds=(uniq_input1.stdout.fileno(),
                                         uniq_input2.stdout.fileno()))

sort_input1.stdout.close()
uniq_input1.stdout.close()
sort_input2.stdout.close()
uniq_input2.stdout.close()
comm_proc.wait() # let the pipeline run
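
The /dev/fd/N plus pass_fds mechanism used above can be looked at in isolation. This is a minimal sketch, assuming a platform that exposes /dev/fd (Linux and macOS do); the helper name cat_via_fd is hypothetical. The child process receives the read end of a pipe under the same descriptor number and opens it by path, which is what comm does with the two uniq outputs:

```python
import os
import subprocess

def cat_via_fd(data: bytes) -> bytes:
    """Feed data to `cat` through an inherited pipe that cat opens as /dev/fd/N."""
    read_fd, write_fd = os.pipe()
    try:
        # pass_fds keeps read_fd open in the child, under the same number
        proc = subprocess.Popen(['cat', f'/dev/fd/{read_fd}'],
                                stdout=subprocess.PIPE,
                                pass_fds=(read_fd,))
    except BaseException:
        os.close(write_fd)
        raise
    finally:
        os.close(read_fd)  # the parent no longer needs its copy of the read end
    # Closing the write end is what lets cat see end-of-file.
    # Small payloads only: nothing drains cat's stdout while we are writing.
    with os.fdopen(write_fd, 'wb') as writer:
        writer.write(data)
    out, _ = proc.communicate()
    return out

print(cat_via_fd(b'a\nb\n'))
```

Without pass_fds, Popen's default close_fds=True would close the descriptor in the child, and the /dev/fd path would not refer to the pipe.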
Charles Duffy
  • You can also set the executable keyword during subprocess.Popen invocation, e.g. `subprocess.Popen(cmd, shell=True, executable='/bin/bash')`. This may be worth mentioning. – Josh Cooley Oct 01 '20 at 17:18
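
The suggestion in the comment above could look roughly like this. It is a sketch that assumes bash is installed at /bin/bash; the helper name run_in_bash is hypothetical. Note that with shell=True the whole command is a single string, so this form does not give the out-of-band filename protection described in the answer and is best kept to fixed, trusted command text:

```python
import subprocess

def run_in_bash(command: str) -> subprocess.CompletedProcess:
    """Run a trusted command string with bash instead of the default /bin/sh."""
    return subprocess.run(command,
                          shell=True,
                          executable='/bin/bash',  # assumed location of bash
                          capture_output=True, text=True, check=True)

# Process substitution works because bash, not /bin/sh, interprets the string.
result = run_in_bash('cat <(echo hi)')
print(result.stdout)
```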