
I'm perplexed about how best to use pip in the face of security concerns about malicious packages or install scripts. I'm not much of a security expert, so I may just be confused (bear with me), but it seems that there are 4, possibly overlapping, approaches:

(1) Use sudo pip for everything

This is how I do things now. I generally do not need virtualenvs and like the convenience of having all my packages work for all my tools. I also don't install a lot of experimental packages, sticking pretty much to the well-known and widely used ones (matplotlib, six, etc).

I gather this can be a risky approach, though, because the installation process runs with root privileges and could potentially do anything. On the other hand, it has the advantage of protecting the site-packages directory from subsequent mischief by anything (not just packages) running without root privileges after an install.

This approach also can't be completely avoided, as some packages (pip itself) need it to bootstrap any Python installation.

(2) Create a pip user and give it ownership of site-packages

This would seem to have the advantage of restricting what pip can do: all it can do is install to site-packages. But I'm not sure about side effects, or whether it would even work (when, for example, pip needs to put things in other locations). A more realistic variant is to set things up this way, use pip as "pip-user" when that works, and use sudo when it doesn't.

(3) Give myself ownership of site-packages

I gather this is a very bad idea, but I'm not sure quite why. It would mean that any code I run would be able to tamper with site-packages; but it would also mean that malicious install scripts could only damage things I can damage myself anyway.

(4) "Use a virtualenv"

This suggestion comes up a lot, but I don't see how it helps. It seems no different from 3 to me since it creates a site-packages that I own.

Which, if any, of these approaches, or combinations of approaches, is best for ensuring that using pip does not expose my system? My concern is mostly with my system as a whole, and only secondarily with my Python installation in site-packages (which I can always rebuild if need be).


Part of the problem I have is that I don't know how to weigh the risks. An example approach that seems to make sense to my limited understanding is simply to do (1) for the most part, and use a virtualenv (4) for any package that I worry might damage my site-packages. Anything I've installed will still be able to damage anything I have access to, but that seems unavoidable, and at least things I don't have access to will be safe (except during the installation process itself). But I have trouble evaluating whether the protection this affords is worth the risk it creates.

orome
  • If you think 3 and 4 are much the same, I suggest you read more about what a virtualenv is. From the docs (http://docs.python-guide.org/en/latest/dev/virtualenvs/): "A Virtual Environment, put simply, is an isolated working copy of Python which allows you to work on a specific project without worry of affecting other projects." – Andrea de Marco Jan 11 '14 at 22:33
  • @AndreadeMarco: Yes, but if an install script is malicious, it will do damage wherever it wants, won't it? And if something malicious has been installed in `site-packages` it's no different from something malicious installed anywhere: it will run as me and do whatever damage I could do. – orome Jan 11 '14 at 22:36
  • The key difference being that a virtualenv is "an **isolated** working copy of python" – yuvi Jan 11 '14 at 22:40
  • I'm probably missing what *malicious* means to you. – Andrea de Marco Jan 11 '14 at 22:42
  • @AndreadeMarco: Good question. Let's say specifically (a) delete files or (b) transmit the contents of files to a third party. – orome Jan 11 '14 at 22:44
  • @yuvi: In what sense would a virtualenv be "isolated" from the kind of malicious behavior I describe in the previous comment (either during the installation process, or after)? – orome Jan 11 '14 at 22:45
  • The link you gave explains it - " If you give yourself write privilege to the system site-packages, you're risking that any program [...] can inject malicious code into the system site-packages and obtain root privilege". In other words - if your site-packages is isolated and you're *not* using the **system site-packages**, then there's no risk giving yourself permissions for it (which is exactly what the virtualenv does) – yuvi Jan 11 '14 at 22:52
  • @yuvi: So, for example, when I'm running an install script in a virtualenv, it will be unable to read files elsewhere on my system (say, my home folder), or delete them? How does that work? Doesn't a virtualenv only protect the system `site-packages` (and then only if it is root-owned)? – orome Jan 11 '14 at 22:54
  • I have to say, I'm definitely not an expert on security, and I'm not completely sure how virtualenv works, but this is how I understand it - say you create your virtualenv on your desktop and give yourself permission to its folder. Even if a malicious program got in, it still wouldn't have permissions to the system files, right? That's not the case with the system site-packages, where a malicious program can gain root privileges and do whatever it wants – yuvi Jan 11 '14 at 22:59
  • @yuvi: So no different from (3), right? – orome Jan 11 '14 at 23:07
  • It is different. The virtualenv you create has its own site-packages and its own pip. By installing inside it you prevent any malicious program you accidentally install from accessing your system site-packages, so it cannot gain root privileges and can "only damage what you can damage" – yuvi Jan 11 '14 at 23:09

1 Answer


You probably want to look at using a virtualenv. To quote the docs:

Virtualenv is a tool to create isolated Python environments. The basic problem being addressed is one of dependencies and versions, and indirectly permissions.

Virtualenv will create a folder with an isolated copy of Python, an isolated pip and an isolated site-packages. You're thinking that this is the same as option 3 because you're taking the advice you linked at face value and not reading into it:

If you give yourself write privilege to the system site-packages, you're risking that any program that runs under you (not necessarily python program) can inject malicious code into the system site-packages and obtain root privilege.

The problem is not with having access to site-packages (you have to have privileges for site-packages to be able to do anything). The problem is with having access to the system site-packages. A virtual environment's site-packages does not expose root privileges to malicious code the way the one your entire system is using does.
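To see which site-packages you are actually dealing with, a small check along these lines may help. It is only a sketch: the helper names `in_virtualenv` and `describe_site_packages` are made up for illustration, and it assumes Python 3.3+ (where `sys.base_prefix` exists) and an interpreter where `site.getsitepackages()` is available. It reports whether the interpreter is running inside a virtual environment, and who owns each site-packages directory and whether you can write to it.

```python
import os
import site
import sys


def in_virtualenv():
    """Return True if the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the underlying installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def describe_site_packages():
    """Return (path, owner_uid, writable_by_me) for each existing site-packages dir."""
    report = []
    for path in site.getsitepackages():
        if not os.path.isdir(path):
            continue
        # st_uid is 0 for root on POSIX; on Windows it is always 0 and
        # says nothing about ownership.
        owner_uid = os.stat(path).st_uid
        report.append((path, owner_uid, os.access(path, os.W_OK)))
    return report


if __name__ == "__main__":
    print("inside a virtualenv:", in_virtualenv())
    for path, owner_uid, writable in describe_site_packages():
        print(path, "owner uid:", owner_uid, "writable by me:", writable)
```

Run it once with the system Python and once with a virtualenv's Python to compare the two.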

However, I see nothing wrong with using sudo pip for well-known and familiar packages. At the end of the day, it's like installing any other program, even a non-Python one. If you go to its website, it looks honest, and you trust it, there's no reason not to sudo.

Moreover, pip is pretty safe: it uses HTTPS for PyPI, and if you pass --allow-external it will download packages from third-party sites but still compare them against the checksums kept on PyPI. For third-party downloads with no checksum you need to explicitly pass --allow-unverified, which is the only option considered unsafe.
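The kind of checksum comparison described above can be sketched with the standard library. This is not pip's own code, just an illustration of the idea; the function names `sha256_of` and `matches_expected` are hypothetical, and the expected digest is assumed to come from a source you trust.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True if the file's digest equals the expected hex digest."""
    # Normalise whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

You would call `matches_expected` on a downloaded archive before installing it. Newer pip versions can enforce this themselves with `pip install --require-hashes -r requirements.txt`, where each requirement line carries a `--hash=sha256:...` option.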

As a personal note, I can add that I use sudo pip most of the time, but as a web developer virtualenv is kind of a day-to-day thing, and I can recommend using it as well (especially if you see anything sketchy but you still want to try it out).
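For trying something out without sudo, here is a minimal sketch using the standard-library `venv` module (Python 3.3+) rather than the virtualenv tool itself. The directory name `sketchy-env` and the helper names are made up for illustration. It creates an environment somewhere you already own and shows that its site-packages is writable by you without root.

```python
import os
import tempfile
import venv


def create_env(parent_dir, name="sketchy-env"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps the sketch fast and offline; pass with_pip=True
    # to have pip bootstrapped into the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


def find_site_packages(env_dir):
    """Return the first site-packages directory found under env_dir, or None."""
    for root, dirs, _files in os.walk(env_dir):
        if "site-packages" in dirs:
            return os.path.join(root, "site-packages")
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as parent:
        env_dir = create_env(parent)
        site_packages = find_site_packages(env_dir)
        print("environment:", env_dir)
        print("site-packages:", site_packages)
        print("writable without sudo:", os.access(site_packages, os.W_OK))
```

The sketch uses a temporary directory so it cleans up after itself; for real use you would pick a directory under your home folder instead.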

yuvi
  • How, specifically, is that different from option (3)? That's what's not clear to me. – orome Jan 11 '14 at 23:24
  • So when I `pip` in the system `site-packages` (e.g., on OSX `/Library/Python/X.X/site-packages/`), even if I own the directory, a malicious install script could acquire root privileges? And this is not the case when I `pip` in a virtualenv? – orome Jan 11 '14 at 23:37
  • Yes, exactly. That's why the system pip needs sudo privileges to work and the virtual one doesn't – yuvi Jan 11 '14 at 23:40
  • It would still need `sudo` if I took ownership of `site-packages`? That's not how I was interpreting [the suggestion](http://stackoverflow.com/a/21056100/656912). – orome Jan 11 '14 at 23:42
  • No no no. Open up the virtualenv somewhere you already have access to as a non-root user. On your desktop, in your documents directory, etc. – yuvi Jan 11 '14 at 23:46
  • So it's no different from 3 then. If I take over ownership of the system `site-packages`, and `pip` as me, it's just like creating a new `site-packages` as me, which is what virtualenv does (unless it does more than you've said so far). – orome Jan 12 '14 at 00:37
  • @raxacoricofallapatorius There's nothing wrong with being the owner of site-packages unless it's the system site-packages – yuvi Jan 12 '14 at 08:57
  • But how is it different? Other than the fact that I'm obviously installing over the system `site-packages` and could accidentally disrupt the system Python installation, how does it create a different *security* risk for my entire machine? What's different about malicious code running in the system's `site-packages` (that I own) or during installation there, vs. a `site-packages` in another location that I own? – orome Jan 12 '14 at 13:25
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/45051/discussion-between-yuvi-and-raxacoricofallapatorius) – yuvi Jan 12 '14 at 13:30
  • @connorbode that sentence is only dumb if you cut it out of context and completely ignore everything else I wrote. There's nothing constructive about your comment – yuvi Aug 18 '18 at 19:21
  • @yuvi this is a question about security and you're saying there's nothing wrong with using sudo to install stuff. "it's the same as installing any other program, non-python" -- true, but you still NEVER give root access to some random program from the internet. everything else about the answer was fine, but indeed, you've totally ruined it by suggesting `sudo pip install ANYTHING` is a good idea. – connorbode Aug 18 '18 at 23:26
  • @connorbode I never said it was a good idea to install any random program from the internet with sudo. what I said, specifically, is that it was ok for *"well known and familiar packages"*. e.g., I don't see anything wrong with sudo installing packages like Numpy, Django, Requests, etc. – yuvi Aug 29 '18 at 18:57
  • @yuvi they're public repos that anyone can contribute to. it's quite possible that someone can sneak something nasty into them, hence if you have the option don't sudo. well known != safe. (for example, with another package manager, https://medium.com/@vesirin/how-i-gained-commit-access-to-homebrew-in-30-minutes-2ae314df03ab) – connorbode Aug 29 '18 at 22:04
  • @connorbode It's possible, but if you're afraid of that, I explained how to avoid using sudo. That's really the users decision who to trust, and his responsibility when he uses sudo commands (again, not just for python) – yuvi Aug 31 '18 at 16:27