The smallest possible change is to use bash explicitly. shell=True uses /bin/sh, which isn't guaranteed to support process substitution.
Passing filenames out-of-band from the shell command avoids security bugs: hostile or unexpected filenames can't run arbitrary commands. (Similarly, preferring sort <"$1" to sort "$1" means that a filename that starts with a dash can't be misread as an option.)
import subprocess

subprocess.Popen([
    'bash', '-c', 'comm -13 <(sort -u <"$1") <(sort -u <"$2") >"$3"',
    '_',                                     # this becomes $0
    '9-21-pull/animals.tsv',                 # this becomes $1
    'full-data/9-28-full-data/animals.tsv',  # this becomes $2
    '9-28-pull/animals.tsv'                  # this becomes $3
])
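To see why the out-of-band approach is safe, here's a minimal sketch. The filename and the show_arg helper are made up for illustration, and the script only echoes $1 back instead of running comm. The point is that the shell treats $1 as data, so a command substitution embedded in the name is never evaluated.

```python
import subprocess

# Hypothetical hostile filename: if it were spliced into the command text,
# the shell would run the embedded `echo INJECTED`.
hostile_name = '$(echo INJECTED).tsv'


def show_arg(name):
    """Pass name as $1 to a fixed script and return what the shell printed."""
    result = subprocess.run(
        ['sh', '-c', 'printf "%s\\n" "$1"', '_', name],
        stdout=subprocess.PIPE, check=True, text=True)
    return result.stdout


# The name comes back unchanged; nothing inside it was executed.
print(show_arg(hostile_name), end='')
```

The script text is a constant, so no filename can change what the shell parses.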
Now, what if you don't want to use a shell at all? I'm going to leave in the separate uniq commands here just to show a more complicated case.
#!/usr/bin/env python3
import subprocess

input1 = '9-21-pull/animals.tsv'  # I'm using a different file when testing, ofc
input2 = 'full-data/9-28-full/animals.tsv'
outfile = '9-28-pull/animals.tsv'

# First input: sort <input1 | uniq
sort_input1 = subprocess.Popen(['sort'],
                               stdin=open(input1, 'r'),
                               stdout=subprocess.PIPE)
uniq_input1 = subprocess.Popen(['uniq'],
                               stdin=sort_input1.stdout,
                               stdout=subprocess.PIPE)

# Second input: sort <input2 | uniq
sort_input2 = subprocess.Popen(['sort'],
                               stdin=open(input2, 'r'),
                               stdout=subprocess.PIPE)
uniq_input2 = subprocess.Popen(['uniq'],
                               stdin=sort_input2.stdout,
                               stdout=subprocess.PIPE)

# comm reads both pipelines by name via /dev/fd/N; pass_fds keeps those
# descriptors open in the child process.
comm_proc = subprocess.Popen(['comm', '-13',
                              f'/dev/fd/{uniq_input1.stdout.fileno()}',
                              f'/dev/fd/{uniq_input2.stdout.fileno()}'],
                             stdout=open(outfile, 'w'),
                             pass_fds=(uniq_input1.stdout.fileno(),
                                       uniq_input2.stdout.fileno()))

# Close the parent's copies of the pipe read ends; the children hold their own.
sort_input1.stdout.close()
uniq_input1.stdout.close()
sort_input2.stdout.close()
uniq_input2.stdout.close()

comm_proc.wait()  # let the pipeline run
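The script above only waits on comm. If you also want to know whether the earlier stages succeeded, one option is to wait on each Popen and look at its exit status. Here's a rough sketch on a single sort | uniq pair. The sorted_unique_lines helper and the throwaway temp file are my own names for illustration, not part of the script above.

```python
import os
import subprocess
import tempfile


def sorted_unique_lines(path):
    """Run `sort <path | uniq` without a shell; return (output, exit codes)."""
    with open(path, 'rb') as infile:
        sort_proc = subprocess.Popen(['sort'], stdin=infile,
                                     stdout=subprocess.PIPE)
        uniq_proc = subprocess.Popen(['uniq'], stdin=sort_proc.stdout,
                                     stdout=subprocess.PIPE)
    sort_proc.stdout.close()  # parent no longer needs the pipe between stages
    output, _ = uniq_proc.communicate()  # read everything, then reap uniq
    return output, [sort_proc.wait(), uniq_proc.returncode]


# Try it on a throwaway file with one duplicated line.
with tempfile.NamedTemporaryFile('w', suffix='.tsv', delete=False) as tmp:
    tmp.write('b\na\nb\n')
output, exit_codes = sorted_unique_lines(tmp.name)
os.unlink(tmp.name)
print(output, exit_codes)
```

A nonzero entry in the list tells you which stage failed, which a bare wait on the last process wouldn't.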