I'm struggling to convert a bash shell command to Python 3.

Here's the shell command that I want to convert to Python:

cat $outDir/aDir/* | cut -f2 | sort -u > $outDir/outFile.txt

I already used subprocess.call() and it worked, but I want to know how to do it with Popen().

Here's my code, which didn't work:

import subprocess
import glob

filePath = outDir + 'aDir/*'
outFilePath = outDir + '/outFile.txt'

fileList = []
for files in glob.glob(filePath):
    fileList.append(files)
with open(files, 'r') as inFile, open(outFilePath, 'w') as outFile : 
  p = subprocess.Popen(['cat'], stdin=inFile, stdout=subprocess.PIPE)   
  p2 = subprocess.Popen(['cut', '-f2'], stdin = p1.stdout, stdout=subprocess.PIPE)
  p3 = subprocess.Popen(['sort', '-u'], stdin = p2.stdout, stdout = outFile)

Also, could you explain why shell=True is harmful? I saw it in many answers but don't know why...

Thank you.

Will
sh kim
  • http://stackoverflow.com/questions/3172470/actual-meaning-of-shell-true-in-subprocess explains why you want to avoid `shell=True`. – tripleee Jun 03 '16 at 06:06
  • @tripleee: and [this shows that `shell=True` can be useful](http://stackoverflow.com/q/295459/4279) (if there is no untrusted input then it is more likely that one makes a mistake while reimplementing the shell pipeline using `subprocess.Popen` directly than one gets an error due to incompatibilities for a simple shell command). – jfs Jun 04 '16 at 19:57
  • The problem isn't that it is useless, but that using it requires understanding. The shell has the unfortunate honor of having proportionally more users who don't know even the absolute basics than even PHP and VBscript combined. (This question is markedly above average in that respect.) – tripleee Jun 05 '16 at 06:16

2 Answers

You need to pass the list of files to cat as arguments. So

subprocess.Popen(['cat'], stdin=inFile, stdout=subprocess.PIPE)

should become

subprocess.Popen(['cat'] + fileList, stdout=subprocess.PIPE)

Consequently, inFile is no longer needed.

So, all in all:

import subprocess
import glob

filePath = outDir + '/aDir/*'
outFilePath = outDir + '/outFile.txt'

fileList = glob.glob(filePath)  # cat takes the file names as arguments
with open(outFilePath, 'w') as outFile:
  p1 = subprocess.Popen(['cat'] + fileList, stdout=subprocess.PIPE)
  p2 = subprocess.Popen(['cut', '-f2'], stdin=p1.stdout, stdout=subprocess.PIPE)
  p3 = subprocess.Popen(['sort', '-u'], stdin=p2.stdout, stdout=outFile)
  p3.wait()  # wait for the last stage of the pipeline to finish
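
For completeness, here is a slightly fuller sketch of the same pipeline wrapped in a function. The function name `unique_second_fields` and its parameters are made up for illustration, and it assumes `cat`, `cut` and `sort` are on the PATH. Closing the parent's copies of the intermediate pipes follows the pattern shown in the subprocess documentation for replacing shell pipelines; treat this as one possible arrangement rather than the definitive one.

```python
import glob
import os
import subprocess


def unique_second_fields(in_dir, out_path):
    """Rough equivalent of: cat in_dir/* | cut -f2 | sort -u > out_path"""
    # Hypothetical helper: in_dir and out_path are illustrative names.
    file_list = sorted(glob.glob(os.path.join(in_dir, '*')))
    if not file_list:
        # With no arguments cat would read from our own stdin, so stop early.
        raise ValueError('no input files found in ' + in_dir)

    with open(out_path, 'w') as out_file:
        p1 = subprocess.Popen(['cat'] + file_list, stdout=subprocess.PIPE)
        p2 = subprocess.Popen(['cut', '-f2'], stdin=p1.stdout,
                              stdout=subprocess.PIPE)
        p3 = subprocess.Popen(['sort', '-u'], stdin=p2.stdout,
                              stdout=out_file)
        # Drop the parent's copies of the pipe ends so an upstream process
        # can receive SIGPIPE if a downstream one exits early.
        p1.stdout.close()
        p2.stdout.close()
        # Wait for every stage so no child is left unreaped.
        p3.wait()
        p2.wait()
        p1.wait()
    return p3.returncode
```

Calling `unique_second_fields(outDir + '/aDir', outDir + '/outFile.txt')` should then write the de-duplicated second column to the output file and return the exit status of `sort`.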
iruvar

What about just using shell=True and keeping the pipes?

with open(files, 'r') as inFile, open(outFilePath, 'w') as outFile:
  # the shell runs the pipeline; input comes from inFile (a single file), output goes to outFile
  p = subprocess.Popen('cut -f2 | sort -u', shell=True, stdin=inFile, stdout=outFile)
  p.communicate()

Or even, more simply:

p = subprocess.Popen("cat {} | cut -f2 | sort -u > '{}'".format(filePath, outFilePath), shell=True)
p.communicate()

Or, even more simply (thanks @tripleee!):

subprocess.call("cat {} | cut -f2 | sort -u > '{}'".format(filePath, outFilePath), shell=True)

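As for shell=True, the real danger is untrusted input: anything you interpolate into the command string is parsed by the shell. Wrapping a value in single quotes is not enough if the value can itself contain a single quote, so I'd recommend escaping and sanitizing every input, for example with shlex.quote().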
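
To make the risk concrete, here is a small sketch. The helper `shell_echo` and the `hostile` value are invented for this illustration; `shlex.quote()` and `subprocess.check_output()` are standard library calls. It assumes a POSIX shell with `printf` and Python 3.7 or later (for `text=True`).

```python
import shlex
import subprocess


def shell_echo(template, value):
    """Hypothetical helper: fill value into template, run it via the shell, return stdout."""
    command = template.format(value)
    return subprocess.check_output(command, shell=True, text=True)


# Made-up hostile input, standing in for a file name that came from a user.
hostile = "x'; echo INJECTED; echo 'y"

# Naive approach: wrap the value in single quotes by hand.
# The quote inside the value ends the string early, so the shell
# runs the embedded echo commands as separate commands.
naive = shell_echo("printf '%s\\n' '{}'", hostile)

# Safer approach: let shlex.quote() do the escaping, so the shell
# sees the whole value as a single word.
quoted = shell_echo("printf '%s\\n' {}", shlex.quote(hostile))

print(naive.splitlines())
print(quoted.splitlines())
```

In the naive case the output contains a separate INJECTED line, which shows that part of the input was executed as a command. In the quoted case the whole value comes back as a single line of data. If no shell features are needed, passing an argument list without shell=True avoids the problem altogether.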

Will