
I am trying to run a simple awk shell command and capture its output (using python2). Here is what I am trying to do:

import subprocess as sb

shell = ["awk '!/<tag>/ {print \"\\"\"$1\"\\"\", \"\\"\"$2\"\\"\"}' test.txt"]
p = sb.check_output(shell, shell=True)
print p

test.txt content:

a, b, 5
a, c, 3
d, d, 1

I want to get the following output with awk and store it in a variable:

"a" "b"
"a" "c"
"d" "d"

However, I obviously lack the knowledge of how to properly handle double quotes. I tried escaping them with several backslashes, but none of it worked. How do I correctly escape the double quotes so that the example above works?

mklement0
minerals
  • Which version of python are you using? – xio4 Dec 05 '14 at 22:42
  • Possible duplicate of [awk commands within python script](http://stackoverflow.com/questions/16675211/awk-commands-within-python-script) – TehTris Dec 05 '14 at 22:44
  • What does test.txt look like? – spicavigo Dec 05 '14 at 22:46
  • The obvious thing to do here is to not use `shell=True`, and not try to build a command line that quotes the quotes and so on. If you're not using any shell features, why make your life more difficult (and your code less efficient, and less secure, and harder to debug)? – abarnert Dec 05 '14 at 22:47
  • Obviously it is python2, because of `print p` ;) – minerals Dec 05 '14 at 22:47
  • @TehTris you're right, I could not find this page... – minerals Dec 05 '14 at 22:51
  • It's worded differently (because the OP on that post didn't ask the correct question) but it's the same thing. Now we can forever point this question to that one. – TehTris Dec 05 '14 at 22:53
  • I'm not sure exactly what `awk` command line you're trying to run here. Is it `awk '!/<tag>/ {print "\""$1"\"", "\""$2"\""}' test.txt`? Or what? But even that's going to include the commas, which aren't in your desired output… – abarnert Dec 05 '14 at 22:55

1 Answer


When you use shell=True but pass a list, what happens depends on the platform. On POSIX, only the first item is used as the command string, and any further items become arguments to the shell itself rather than to your command. On Windows, Python merges the list into one command line and applies its own quoting, on top of whatever quoting you did. Either way, this is going to be a nightmare to get right. If you want to use shell=True, just pass a string.

But that raises the question of why you're using shell=True in the first place. Without it, you could just pass a list of arguments, without having to quote any of them to protect them from the shell. That is much easier to write, easier to debug, and more efficient and more secure to boot. Unless you actually need shell features, or you've got a command line that you worked hard to get working and don't want to spend time breaking down into separate arguments, never use the shell.
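
If breaking a working command line into separate arguments by hand feels error-prone, the standard library's shlex.split can do it with shell-like rules. This is only a minimal sketch: command_line and args are names made up for the illustration, and the command line is the awk invocation discussed in this answer.

```python
import shlex

# The command line exactly as it would be typed in a shell.
# A raw triple-quoted string keeps every backslash and quote as written.
command_line = r'''awk '!/<tag>/ {print "\""$1"\"", "\""$2"\""}' test.txt'''

# shlex.split applies POSIX shell quoting rules: the single-quoted awk
# program becomes one argument, with the outer single quotes removed.
args = shlex.split(command_line)
print(args)
```

The resulting list has three items (the program name, the awk program, and the file name) and can be handed to subprocess.check_output without shell=True.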


I'm not actually sure what awk command you're trying to run here. If you give it the double-quotes around $1 and $2 it's just going to print a literal "$1" "$2", because that's what quotes mean to awk. Maybe you wanted something like this?

awk '!/<tag>/ {print "\""$1"\"", "\""$2"\""}' test.txt

In which case:

import subprocess

subprocess.check_output(['awk', r'!/<tag>/ {print "\""$1"\"", "\""$2"\""}',
                         'test.txt'])

(Note that I used a raw string so I could pass the "\"" literally, without having to backslash the backslash.)
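
To see what the raw string buys you, here is a small check comparing the two spellings of the same four characters. The variable names raw and escaped are made up for this illustration.

```python
# Raw string: the backslash is kept exactly as typed.
raw = r'"\""'

# Ordinary string: the backslash has to be doubled to survive.
escaped = '"\\""'

# Both spell the same characters: quote, backslash, quote, quote.
assert raw == escaped
print(len(raw))
```

Both forms hand awk the same text; the raw string is just easier to read next to the original awk command.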

But this still doesn't produce your desired output: awk splits fields on whitespace by default, so $1 is going to be a, (with the trailing comma), and "\""$1"\"" is going to be "a,".
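
One possible way around that, assuming the fields in test.txt are always separated by a comma followed by optional spaces, is to tell awk to split on that pattern with its standard -F option. This is a hedged sketch rather than the only fix. The names AWK_PROGRAM and quote_first_two_fields are invented for the example, and universal_newlines=True is used so that the result is text on both Python 2.7 and Python 3.

```python
import subprocess

# Same awk program as above, kept in a raw string so the "\"" pieces
# reach awk unchanged.
AWK_PROGRAM = r'!/<tag>/ {print "\"" $1 "\"", "\"" $2 "\""}'


def quote_first_two_fields(path):
    """Run awk without a shell and return its output as text."""
    return subprocess.check_output(
        # -F ', *' makes awk split on a comma plus any following spaces
        # (an assumption about the input format).
        ['awk', '-F', ', *', AWK_PROGRAM, path],
        universal_newlines=True,
    )
```

With the sample test.txt from the question, quote_first_two_fields('test.txt') should return the three desired lines, each field wrapped in double quotes and separated by a single space.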

abarnert