25

When using subprocess.Popen(args, shell=True) to run "gcc --version" (just as an example), on Windows we get this:

>>> from subprocess import Popen
>>> Popen(['gcc', '--version'], shell=True)
gcc (GCC) 3.4.5 (mingw-vista special r3) ...

So it's nicely printing out the version as I expect. But on Linux we get this:

>>> from subprocess import Popen
>>> Popen(['gcc', '--version'], shell=True)
gcc: no input files

Because gcc hasn't received the --version option.

The docs don't specify exactly what should happen to the args under Windows, but they do say that on Unix, "If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional shell arguments." IMHO the Windows way is better, because it lets you treat Popen(arglist) calls the same as Popen(arglist, shell=True) ones.

Why the difference between Windows and Linux here?

Ben Hoyt
  • It is usually a good idea to include the version of Python you are using, or at least whether it is the 2.x or 3.x line. – mloskot Dec 19 '11 at 19:49

3 Answers

16

Actually, on Windows it does use cmd.exe when shell=True: it prepends cmd.exe /c to the shell arguments (more precisely, it looks up the COMSPEC environment variable and falls back to cmd.exe if that is not set). On Windows 95/98 it uses the intermediate w9xpopen program to launch the command.

So the strange implementation is actually the UNIX one, which does the following (where each space separates a different argument):

/bin/sh -c gcc --version

It looks like the correct implementation (at least on Linux) would be:

/bin/sh -c "gcc --version" gcc --version

This sets the command string from the quoted parameter, so the shell runs the whole command, and still passes the remaining parameters through to the shell.

From the sh man page section for -c:

Read commands from the command_string operand instead of from the standard input. Special parameter 0 will be set from the command_name operand and the positional parameters ($1, $2, etc.) set from the remaining argument operands.
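The -c behaviour quoted above can be observed directly. This is a minimal sketch, assuming a POSIX system with /bin/sh available; the echo commands and the placeholder values "name" and "hello" are only illustrations and are not taken from subprocess.py.

```python
import subprocess

# Assumes a POSIX system with /bin/sh available.
# The string after -c is the command; "name" becomes $0 and "--version" becomes $1.
result = subprocess.run(
    ["/bin/sh", "-c", 'echo "0=$0 1=$1"', "name", "--version"],
    capture_output=True,
    text=True,
)
print(result.stdout)

# Same argument layout as a two-item list passed with shell=True on Unix:
# "echo" is the whole command string, and "hello" only becomes $0.
lost = subprocess.run(
    ["/bin/sh", "-c", "echo", "hello"],
    capture_output=True,
    text=True,
)
print(repr(lost.stdout))
```

The second call prints only a newline, which mirrors gcc never seeing --version in the question.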

This patch seems to fairly simply do the trick:

--- subprocess.py.orig  2009-04-19 04:43:42.000000000 +0200
+++ subprocess.py       2009-08-10 13:08:48.000000000 +0200
@@ -990,7 +990,7 @@
                 args = list(args)

             if shell:
-                args = ["/bin/sh", "-c"] + args
+                args = ["/bin/sh", "-c"] + [" ".join(args)] + args

             if executable is None:
                 executable = args[0]
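
A caller-side alternative that does not require patching subprocess is to build the command string yourself. This is only a sketch, assuming Python 3.8+ (for shlex.join) and a POSIX shell; unlike the plain " ".join in the patch, shlex.join quotes items that contain spaces. The echo command and its arguments are placeholders.

```python
import shlex
import subprocess

args = ["echo", "My File", "--version"]

# shlex.join (Python 3.8+) quotes each item for a POSIX shell.
command = shlex.join(args)
print(command)

# Assumes a POSIX shell. Passing a single string means the shell parses
# the whole command line, so no argument is silently dropped.
result = subprocess.run(command, shell=True, capture_output=True, text=True)
print(result.stdout)
```
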
David Fraser
  • That's great, thanks David. I agree about the correct implementation and your patch looks good. Are you in a (better) position than I to submit a Python bug report -- in other words, have you done that before, or shall I look into it? – Ben Hoyt Aug 10 '09 at 21:59
  • Added http://bugs.python.org/issue6689 - would be good if you could follow it, comment there etc – David Fraser Aug 12 '09 at 08:35
  • Thanks! I've added myself to the nosy list. – Ben Hoyt Aug 12 '09 at 23:20
  • For reference, the patch was rejected. It may be an idea to look at whether the documentation needs to be amended instead - I'll leave that up to an interested party – David Fraser Jul 21 '10 at 17:49
5

From the subprocess.py source:

On UNIX, with shell=True: If args is a string, it specifies the command string to execute through the shell. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional shell arguments.

On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string using the list2cmdline method. Please note that not all MS Windows applications interpret the command line the same way: The list2cmdline is designed for applications using the same rules as the MS C runtime.

That doesn't answer why, just clarifies that you are seeing the expected behavior.

The "why" is probably that on UNIX-like systems, command arguments are passed through to applications (using the exec* family of calls) as an array of strings. In other words, the calling process decides what goes into EACH command line argument. When you tell it to use a shell, however, the calling process only gets to pass a single command line argument to the shell to execute: the entire command line that you want executed, executable name and arguments, as a single string.
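
A small illustration of the array-of-strings point, as a sketch rather than anything from subprocess.py: without shell=True, each list item reaches the child as its own argument, spaces included. The child here is a hypothetical Python one-liner that just prints the arguments it received.

```python
import subprocess
import sys

# The child is a Python one-liner that prints its own argument list.
child = "import sys; print(sys.argv[1:])"

# No shell is involved, so "My File" arrives as one argument.
result = subprocess.run(
    [sys.executable, "-c", child, "My File", "--version"],
    capture_output=True,
    text=True,
)
print(result.stdout)
```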

But on Windows, the entire command line (according to the above documentation) is passed as a single string to the child process. If you look at the CreateProcess API documentation, you will notice that it expects all of the command line arguments to be concatenated together into a big string (hence the call to list2cmdline).
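
The conversion mentioned above can be inspected on any platform, since subprocess.list2cmdline is a plain function in the module. A minimal sketch, with the argument lists chosen only for illustration:

```python
import subprocess

# list2cmdline joins a sequence into one string using MS C runtime quoting rules.
simple = subprocess.list2cmdline(["gcc", "--version"])
quoted = subprocess.list2cmdline(["cp", "My File", "New Location"])

print(simple)
print(quoted)
```

Items containing spaces are wrapped in double quotes, which is how the list form survives being flattened into a single command line.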

Plus there is the fact that on UNIX-like systems there actually is a shell that can do useful things, so I suspect that the other reason for the difference is that on Windows, shell=True does nothing, which is why it is working the way you are seeing. The only way to make the two systems act identically would be for it to simply drop all of the command line arguments when shell=True on Windows.

Adam Batkin
  • There is a shell on Windows too (normally `cmd.exe`), but the comment you quoted above indicates that Python does not actually use it when shell=True (instead, it uses `CreateProcess()` directly). – Greg Hewgill Aug 10 '09 at 05:02
  • Thanks -- as Greg mentioned, there definitely is a shell on Windows (cmd.exe or the one in COMSPEC). And it is used by Popen (though via CreateProcess) -- see the subprocess.py source. So it still definitely seems to me that subprocess should make them work the same way, to avoid portability pitfalls... – Ben Hoyt Aug 10 '09 at 05:11
  • note: [the docs have been updated since 2010](https://docs.python.org/3/library/subprocess.html#popen-constructor) – jfs Oct 30 '14 at 11:18
-1

The reason for the UNIX behaviour of shell=True is to do with quoting. When we write a shell command, it will be split at spaces, so we have to quote some arguments:

cp "My File" "New Location"

This leads to problems when our arguments contain quotes, which requires escaping:

grep -r "\"hello\"" .

Sometimes we can get awful situations where \ must be escaped too!

Of course, the real problem is that we're trying to use one string to specify multiple strings. When calling system commands, most programming languages avoid this by allowing us to send multiple strings in the first place, hence:

Popen(['cp', 'My File', 'New Location'])
Popen(['grep', '-r', '"hello"', '.'])

Sometimes it can be nice to run "raw" shell commands; for example, if we're copy-pasting something from a shell script or a Web site, and we don't want to convert all of the horrible escaping manually. That's why the shell=True option exists:

Popen('cp "My File" "New Location"', shell=True)
# Raw string, so Python keeps the backslashes for the shell to interpret
Popen(r'grep -r "\"hello\"" .', shell=True)
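
If the goal is to go the other way and turn a copy-pasted shell command into a list, the standard library's shlex.split applies POSIX shell quoting rules. This is a minimal sketch; it handles quoting and escaping only, not shell features such as pipes, globs or variable expansion.

```python
import shlex

# Split shell-style command strings into argument lists (POSIX quoting rules).
copy_args = shlex.split('cp "My File" "New Location"')
grep_args = shlex.split(r'grep -r "\"hello\"" .')

print(copy_args)
print(grep_args)
```

The resulting lists can be passed to Popen without shell=True.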

I'm not familiar with Windows so I don't know how or why it behaves differently.

Warbo