407

I am calling different processes with the subprocess module. However, I have a question.

In the following code:

callProcess = subprocess.Popen(['ls', '-l'], shell=True)

and

callProcess = subprocess.Popen(['ls', '-l']) # without shell

Both work. After reading the docs, I came to know that shell=True means executing the code through the shell. So that means in absence, the process is directly started.

So what should I prefer for my case - I need to run a process and get its output. What benefit do I have from calling it from within the shell or outside of it?

starball
user225312
  • 42
    the first command is incorrect: `-l` is passed to `/bin/sh` (the shell) instead of `ls` program [on Unix if `shell=True`](http://docs.python.org/2/library/subprocess.html#subprocess.Popen). String argument should be used with `shell=True` in most cases instead of a list. – jfs Feb 18 '14 at 18:14
  • 3
    re "the process is directly started": Wut? – allyourcode Mar 01 '16 at 22:59
  • 19
    The statement "Both work." about those 2 calls is incorrect and misleading. The calls work differently. Just switching from `shell=True` to `False` and vice versa is an error. From [docs](https://docs.python.org/3/library/subprocess.html#subprocess.Popen): "On POSIX with shell=True, (...) If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself.". On Windows there's [automatic conversion](https://docs.python.org/3/library/subprocess.html#converting-argument-sequence), which might be undesired. – mbdevpl Jun 15 '16 at 06:06
  • See also https://stackoverflow.com/q/59641747/874188 – tripleee Jan 08 '20 at 17:55
  • 1
    @DeusXMachina You are incorrectly restating the two older comments which explain this. `subprocess.run(['ls', '-l'], shell=True)` ends up running `sh -c 'ls' 'sh' '-l'`. The arguments are not "silently ignored" but you have to know how to handle this. Granted, for most practical purposes, the simplest and mostly correct guidance is, "don't use `shell=True` if you pass in a list of tokens, and vice versa". Windows tolerates this better, but is of course completely outrageous for other reasons. – tripleee Oct 24 '22 at 07:56

7 Answers

284

The benefit of not calling via the shell is that you are not invoking a 'mystery program.' On POSIX, the environment variable SHELL controls which binary is invoked as the "shell." On Windows, there is no Bourne shell descendant, only cmd.exe.

So invoking the shell invokes a program of the user's choosing and is platform-dependent. Generally speaking, avoid invocations via the shell.

Invoking via the shell does allow you to expand environment variables and file globs according to the shell's usual mechanism. On POSIX systems, the shell expands file globs to a list of files. On Windows, a file glob (e.g., "*.*") is not expanded by the shell, anyway (but environment variables on a command line are expanded by cmd.exe).
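For what it's worth, both kinds of expansion can be done in Python before the child is started, so no shell is needed. The following is only a minimal sketch: the scratch files, the DEMO_GREETING variable, and the use of sys.executable as a stand-in for a real program are all assumptions made for the demonstration.

```python
import glob
import os
import subprocess
import sys
import tempfile

# Create a scratch directory with a few files so the sketch is self-contained.
workdir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt", "c.log"):
    with open(os.path.join(workdir, name), "w"):
        pass

# What a POSIX shell would do for *.txt, done explicitly in Python.
txt_files = sorted(glob.glob(os.path.join(workdir, "*.txt")))

# What the shell would do for a $VARIABLE, done explicitly.
# DEMO_GREETING is a made-up name used only in this sketch.
os.environ["DEMO_GREETING"] = "hello"
greeting = os.environ["DEMO_GREETING"]

# The child receives already-expanded arguments; no shell is involved.
# sys.executable stands in for whatever program you actually want to run.
child_code = "import sys; print(len(sys.argv) - 1)"
result = subprocess.run(
    [sys.executable, "-c", child_code, greeting, *txt_files],
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)
arg_count = int(result.stdout)  # one greeting plus two matching files
```

Because the expansion happens in your own code, it behaves the same way regardless of which shell (if any) is installed.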

If you think you want environment variable expansions and file globs, research the ILS attacks of 1992-ish on network services which performed subprogram invocations via the shell. Examples include the various sendmail backdoors involving ILS.

In summary, use shell=False.

Heath Hunnicutt
  • 3
    Thanks for the answer. Though I am really not at that stage where I should worry about exploits, but I understand what you are getting at. – user225312 Jul 03 '10 at 18:51
  • 97
    If you're careless in the beginning, no amount of worry will help you catch up later. ;) – Heath Hunnicutt Jul 03 '10 at 19:14
  • What if you want to limit max memory of the subprocess? http://stackoverflow.com/questions/3172470/actual-meaning-of-shell-true-in-subprocess – Pramod Feb 24 '13 at 10:49
  • 12
    the statement about `$SHELL` is not correct. To quote subprocess.html: "On Unix with `shell=True`, the shell defaults to `/bin/sh`." (not `$SHELL`) – marcin Feb 11 '16 at 16:27
  • 1
    @user2428107: Yes, if you use backtick invocation on Perl, you're using shell invocation and opening up the same issues. Use 3+ arg `open` if you want secure ways to invoke a program and capture the output. – ShadowRanger Oct 29 '16 at 16:56
  • There is another important difference: on Windows you can't call `*.cmd` or `*.bat` files without using `shell=True` or prepend `cmd /c` – lordscales91 Nov 20 '16 at 11:36
  • 1
    I guess you mean `IFS`? I find nothing about "ILS" related to Sendmail vulnerabilities, while improper handling of `IFS` was a well-known attack vector in early versions of Sendmail. – tripleee Jan 27 '17 at 09:17
  • 1
    @HeathHunnicutt: thanks for the explanation. But what is this ILS attack about. Can you please post a link about it, with more details. I wish to learn more about it. – yogeshagr Feb 07 '17 at 11:07
  • 2
    If user input is not involved, and the script isn't meant to be portable, `shell=True` should be fine, right? – Kevin Feb 12 '17 at 20:28
  • 1
    Windows 10 version 1709 actually has a BASH program, but only works when you enable Windows Subsystem for Linux, which you can do via the optional features dialog. Afterwards, you can just invoke it by typing `bash` in command prompt or launching it via the start menu. Also, it's fully operational so you can use `apt-get` and other cool stuff. – Alex Fanat Apr 11 '18 at 20:34
  • does using shell=True creates any vulnerabilites? – Anish Arya Mar 31 '23 at 10:24
224
>>> import subprocess
>>> subprocess.call('echo $HOME')
Traceback (most recent call last):
...
OSError: [Errno 2] No such file or directory
>>>
>>> subprocess.call('echo $HOME', shell=True)
/user/khong
0

Setting the shell argument to a true value causes subprocess to spawn an intermediate shell process, and tell it to run the command. In other words, using an intermediate shell means that variables, glob patterns, and other special shell features in the command string are processed before the command is run. In the example, $HOME was expanded before the echo command ran. That command relies on shell expansion, whereas ls -l is a simple command that needs none.

source: Subprocess Module

Mina Gabriel
  • 3
    agree. this is a good example for me to understand what shell=True means. – user389955 Sep 15 '17 at 18:03
  • 11
    *Setting the shell argument to a true value causes subprocess to spawn an intermediate shell process, and tell it to run the command* Oh god this tells it all. Why this answer is not accepted??? why? – pouya Sep 24 '17 at 19:42
  • I think the issue is the first argument to call is a list, not a string, but that gives the error if shell is False. Changing the command to a list will make this work – Lincoln Randall McFarland May 30 '18 at 19:06
  • 1
    Sorry my previous comment went before I was done. To be clear: I often see subprocess use with shell = True and the command is a string, e.g. 'ls -l', (I expect to avoid this error) but subprocess takes a list (and a string as a one element list). To run with out invoking a shell (and the [security issues with that](https://docs.python.org/2/library/subprocess.html#using-the-subprocess-module) ) use a list subprocess.call(['ls', '-l']) – Lincoln Randall McFarland May 30 '18 at 19:27
58

An example where things could go wrong with shell=True is shown here:

>>> from subprocess import call
>>> filename = input("What file would you like to display?\n")
What file would you like to display?
non_existent; rm -rf / # THIS WILL DELETE EVERYTHING IN ROOT PARTITION!!!
>>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...

Check the doc here: subprocess.call()
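For contrast, here is a rough sketch of the same hostile input passed as one element of an argument list, without shell=True. The input is hard-coded rather than read with input(), and a small Python child (started through sys.executable) stands in for cat so the sketch runs anywhere; both are assumptions for the demonstration.

```python
import subprocess
import sys

# The same hostile input as in the example above, hard-coded for the demo.
filename = "non_existent; rm -rf /"

# A tiny child program that reports the arguments it received.
# sys.executable is used instead of cat so the sketch is portable.
child_code = "import sys; print(sys.argv[1:])"

# No shell parses the string, so ';' is just another character
# inside a single argument and rm is never run.
result = subprocess.run(
    [sys.executable, "-c", child_code, filename],
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)
```

The child sees the whole string as a single argument, which is why the list form is the usual recommendation when any part of the command comes from outside your program.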

Richeek
  • 7
    The link is very useful. As the link stated: _Executing shell commands that incorporate unsanitized input from an untrusted source makes a program vulnerable to shell injection, a serious security flaw which can result in arbitrary command execution. For this reason, the use of shell=True is strongly discouraged in cases where the command string is constructed from external input._ – jtuki Sep 08 '15 at 07:43
  • 2
    Note that you still have to be careful even when `shell=False`. For example, `call(["rm", filename1, filename2])` could behave unexpectedly if `filename` is `"-r"`, for example, or if it is a path like `../../private/path/filename.txt` . Use double dash and make sure the filenames aren't paths that you don't expect. – Flimm Aug 23 '21 at 12:22
46

Executing programs through the shell means that all user input passed to the program is interpreted according to the syntax and semantic rules of the invoked shell. At best, this only causes inconvenience to the user, because the user has to obey these rules. For instance, paths containing special shell characters like quotation marks or blanks must be escaped. At worst, it causes security leaks, because the user can execute arbitrary programs.

shell=True is sometimes convenient to make use of specific shell features like word splitting or parameter expansion. However, if such a feature is required, make use of the other modules available to you (e.g. os.path.expandvars() for parameter expansion or shlex for word splitting). This means more work, but avoids other problems.
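A minimal sketch of how those two modules might be combined is shown here. The DEMO_DIR variable and the command template are invented for illustration, and nothing is executed; the sketch only builds the argument list.

```python
import os
import shlex

# DEMO_DIR is a hypothetical variable set here only for the demonstration.
os.environ["DEMO_DIR"] = "/tmp/demo dir"

# Split the template into tokens first...
template = 'ls -l "$DEMO_DIR/report.txt"'
tokens = shlex.split(template)

# ...then expand variables inside each token, so a value containing
# spaces stays inside a single argument.
argv = [os.path.expandvars(token) for token in tokens]
print(argv)
```

The resulting list can be handed to subprocess.run() or subprocess.Popen() as-is, without shell=True.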

In short: Avoid shell=True by all means.

nbro
27

The other answers here adequately explain the security caveats which are also mentioned in the subprocess documentation. But in addition to that, the overhead of starting a shell to start the program you want to run is often unnecessary and definitely silly for situations where you don't actually use any of the shell's functionality. Moreover, the additional hidden complexity should scare you, especially if you are not very familiar with the shell or the services it provides.

Where the interactions with the shell are nontrivial, you now require the reader and maintainer of the Python script (which may or may not be your future self) to understand both Python and shell script. Remember the Python motto "explicit is better than implicit"; even when the Python code is going to be somewhat more complex than the equivalent (and often very terse) shell script, you might be better off removing the shell and replacing the functionality with native Python constructs. Minimizing the work done in an external process and keeping control within your own code as far as possible is often a good idea simply because it improves visibility and reduces the risks of -- wanted or unwanted -- side effects.

Wildcard expansion, variable interpolation, and redirection are all simple to replace with native Python constructs. A complex shell pipeline where parts or all cannot be reasonably rewritten in Python would be the one situation where perhaps you could consider using the shell. You should still make sure you understand the performance and security implications.
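As a rough illustration of the redirection and simple pipeline cases, something like the following may be enough. The two child programs are stand-ins started through sys.executable rather than real commands, and the pipeline buffers the first program's whole output in memory, which is an assumption that only suits small outputs.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child programs; sys.executable replaces real commands so the
# sketch does not depend on any particular tool being installed.
producer = [sys.executable, "-c", "print('b'); print('a')"]
sorter = [sys.executable, "-c",
          "import sys; sys.stdout.writelines(sorted(sys.stdin))"]

# Rough equivalent of `producer > out.txt`: hand the child an open file.
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(out_path, "w") as out_file:
    subprocess.run(producer, stdout=out_file, check=True)

# Rough equivalent of `producer | sorter`: capture one program's output
# and feed it to the next one as standard input.
first = subprocess.run(producer, stdout=subprocess.PIPE, text=True, check=True)
second = subprocess.run(sorter, input=first.stdout,
                        stdout=subprocess.PIPE, text=True, check=True)
sorted_output = second.stdout

with open(out_path) as in_file:
    redirected = in_file.read()
```

For large or long-running pipelines, connecting two Popen objects through a pipe is the more usual approach; the buffered version above only aims to show that no shell is required.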

In the trivial case, to avoid shell=True, simply replace

subprocess.Popen("command -with -options 'like this' and\\ an\\ argument", shell=True)

with

subprocess.Popen(['command', '-with','-options', 'like this', 'and an argument'])

Notice how the first argument is a list of strings to pass to execvp(), and how quoting strings and backslash-escaping shell metacharacters is generally not necessary (or useful, or correct). Maybe see also When to wrap quotes around a shell variable?

If you don't want to figure this out yourself, the shlex.split() function can do this for you. It's part of the Python standard library, but of course, if your shell command string is static, you can just run it once, during development, and paste the result into your script.
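For instance, feeding the shell command string from the example above to shlex.split() should reproduce the hand-written list. This is a quick check you could run once during development:

```python
import shlex

# The command string from the shell=True example above.
command = "command -with -options 'like this' and\\ an\\ argument"

# shlex.split() applies POSIX-style quoting and backslash rules.
argv = shlex.split(command)
print(argv)
```

Note that shlex.split() only tokenizes; it does not expand variables, wildcards, or redirections.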

As an aside, you very often want to avoid Popen if one of the simpler wrappers in the subprocess package does what you want. If you have a recent enough Python, you should probably use subprocess.run.

  • With check=True it will fail if the command you ran failed.
  • With stdout=subprocess.PIPE it will capture the command's output.
  • With text=True (or somewhat obscurely, with the synonym universal_newlines=True) it will decode output into a proper Unicode string (it's just bytes in the system encoding otherwise, on Python 3).

If not, for many tasks, you want check_output to obtain the output from a command, whilst checking that it succeeded, or check_call if there is no output to collect.
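To make the options above a little more concrete, here is a small sketch. The child commands are throwaway Python one-liners started through sys.executable, chosen only so the sketch runs on any platform; text=True assumes Python 3.7 or later.

```python
import subprocess
import sys

# Throwaway child commands used purely for demonstration.
ok_cmd = [sys.executable, "-c", "print('hello')"]
fail_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

# Capture and decode the output; raise if the command fails.
completed = subprocess.run(ok_cmd, check=True,
                           stdout=subprocess.PIPE, text=True)
output = completed.stdout  # a str because text=True

# With check=True, a non-zero exit status raises CalledProcessError.
try:
    subprocess.run(fail_cmd, check=True)
except subprocess.CalledProcessError as exc:
    failure_code = exc.returncode

# The older wrapper: returns the output and also raises on failure.
legacy_output = subprocess.check_output(ok_cmd, universal_newlines=True)
```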

I'll close with a quote from David Korn: "It's easier to write a portable shell than a portable shell script." Even subprocess.run('echo "$HOME"', shell=True) is not portable to Windows.

tripleee
  • I thought the quote was from Larry Wall but Google tells me otherwise. – tripleee Mar 15 '16 at 10:20
  • That's high talk - but no technical suggestion for replacement: Here I am, on OS-X, trying to acquire the pid of a Mac App I launched via 'open': process = subprocess.Popen('/usr/bin/pgrep -n ' + app_name, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE) app_pid, err = process.communicate() --- but it doesn't work unless I'll use shell=True. Now what? – Motti Shneor Jul 03 '16 at 12:05
  • There are a ton of questions about *how* to avoid `shell=True`, many with excellent answers. You happened to pick the one which is about *why* instead. – tripleee Jul 03 '16 at 13:45
  • @MottiShneor Thanks for the feedback; added simple example – tripleee Jul 03 '16 at 13:55
  • Perhaps see also [my answer to a general question about `subprocess`](/a/51950538/874188) – tripleee Oct 31 '18 at 19:10
1

The answer above explains it correctly, but not directly enough. Let's use the ps command to see what happens.

import time
import subprocess

s = subprocess.Popen(["sleep 100"], shell=True)
print("start")
print(s.pid)
time.sleep(5)
s.kill()
print("finish")

Running it prints:

start
832758
finish

You can then run ps -auxf > 1 before it finishes, and ps -auxf > 2 after it finishes. Here is the output:

1

cy         71209  0.0  0.0   9184  4580 pts/6    Ss   Oct20   0:00  |       \_ /bin/bash
cy        832757  0.2  0.0  13324  9600 pts/6    S+   19:31   0:00  |       |   \_ python /home/cy/Desktop/test.py
cy        832758  0.0  0.0   2616   612 pts/6    S+   19:31   0:00  |       |       \_ /bin/sh -c sleep 100
cy        832759  0.0  0.0   5448   532 pts/6    S+   19:31   0:00  |       |           \_ sleep 100

See? Instead of running sleep 100 directly, it actually runs /bin/sh, and the pid it prints is the pid of /bin/sh. If you then call s.kill(), it kills /bin/sh, but sleep is still there.

2

cy         69369  0.0  0.0 533764  8160 ?        Ssl  Oct20   0:12  \_ /usr/libexec/xdg-desktop-portal
cy         69411  0.0  0.0 491652 14856 ?        Ssl  Oct20   0:04  \_ /usr/libexec/xdg-desktop-portal-gtk
cy        832646  0.0  0.0   5448   596 pts/6    S    19:30   0:00  \_ sleep 100
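A leftover process like the sleep in listing 2 can usually be avoided by starting the program directly, so that the Popen object refers to the program itself rather than to an intermediate shell. This is only a sketch: a Python one-liner started through sys.executable stands in for sleep 100 so that it runs on any platform.

```python
import subprocess
import sys

# Stand-in for `sleep 100`, started directly (no shell=True).
sleeper = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(100)"]
)

# Because there is no intermediate /bin/sh, sleeper.pid is the pid of the
# sleeping process itself, so kill() reaches the right target.
sleeper.kill()
sleeper.wait(timeout=10)

# poll() returns the exit status once the process has really ended.
exited = sleeper.poll() is not None
print(exited)
```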

So the next question is: what can /bin/sh do? Every Linux user knows it, has heard of it, and uses it. But I bet many people don't really understand what a shell is. You may also have heard of /bin/bash; the two are similar.

One obvious function of the shell is to make it convenient for users to run Linux applications. Because of a shell program like sh or bash, you can directly use a command like ls rather than /usr/bin/ls; it will search for where ls is and run it for you.

Another function is that it interprets the string after $ as an environment variable. You can compare these two Python scripts to find out for yourself.

subprocess.call(["echo $PATH"], shell=True)
subprocess.call(["echo", "$PATH"])

And most importantly, it makes it possible to run Linux commands as a script. Constructs such as if and else are introduced by the shell; they are not native Linux commands.

demonguy
  • 2
    "Of course the concept of environment variable is also introduced by shell program." That's incorrect. Environment variables are a thing without shells. – AKX Oct 26 '21 at 11:42
  • you're right, i use the wrong word to describe it. I change my statement – demonguy Oct 26 '21 at 11:55
  • There is no "above" or "below"; the order of answers on this page depends on each individual visitor's preferences. For example, yours is the top answer for me right now because it's the newest one. – tripleee Oct 24 '22 at 07:16
  • Passing the first argument as a single string _inside a list_ is very confusing here. It works, but I'm tempted to say it probably shouldn't. As repeated in several comments elsewhere on this page, pass a single string with `shell=True`, or a list of tokenized strings without it. Anything else has problems with portability and robustness, as well as understandability. Why would you want to use a list here at all; what did you hope it should mean? And what should it then mean if the list has more than one element? (Hint: It doesn't do that. Unless you sneakily answer "it should be unobvious.") – tripleee Oct 24 '22 at 10:36
  • The shell is not responsible for `PATH` lookups. `subprocess.run(["ls"])` works fine without `shell=True`. The `exec*` system call is responsible for looking up the executable on the `PATH`, and that's what we are basically dispatching here. (Windows is slightly different, but not in this detail; the system call is StartProcess and it accepts a string instead of a list of strings, which is why `subprocess` ends up behaving differently on Windows when it comes to passing a string vs passing a list of strings. But `PATH` lookup works the same, as an OS service, which doesn't require a shell.) – tripleee Oct 24 '22 at 10:41
-3

Let's assume you are using shell=False and providing the command as a list, and some malicious user tries to inject an 'rm' command. You will see that 'rm' is interpreted as an argument, and 'ls' will effectively try to find a file called 'rm'.

>>> subprocess.run(['ls','-ld','/home','rm','/etc/passwd'])
ls: rm: No such file or directory
-rw-r--r--    1 root     root          1172 May 28  2020 /etc/passwd
drwxr-xr-x    2 root     root          4096 May 29  2020 /home
CompletedProcess(args=['ls', '-ld', '/home', 'rm', '/etc/passwd'], returncode=1)

shell=False is not secure by default if you don't control the input properly. You can still execute dangerous commands.

>>> subprocess.run(['rm','-rf','/home'])
CompletedProcess(args=['rm', '-rf', '/home'], returncode=0)
>>> subprocess.run(['ls','-ld','/home'])
ls: /home: No such file or directory
CompletedProcess(args=['ls', '-ld', '/home'], returncode=1)
>>>

I write most of my applications in container environments, I know which shell is being invoked, and I am not taking any user input.

So in my use case, I see no security risk, and it is much easier to create long strings of commands. I hope I am not wrong.