11

I want to ensure that os.environ and sys.path are identical for all ways we start the Python interpreter:

  • web requests via Django, and Apache mod_wsgi
  • Cron jobs
  • Interactive logins via ssh
  • Celery jobs
  • Jobs started via systemd

Is there a common way to solve this?

If yes, great: what does it look like?

If no, sad: everybody solves this on their own. ... What is a good way to solve this?

Operating System: Linux (with systemd support)

Update

More explicit:

  1. I want sys.path to be the same in web requests, cron jobs, python started from shell, ...
  2. I want os.environ to be the same in web requests, cron jobs, python started from shell, ...

Update2

For systemd we use EnvironmentFile

Update3

We use virtualenv

guettli
  • @Keith I updated the question: – guettli Feb 08 '16 at 09:18
  • You should fix the question, as @Software Mechanic mentions: "I'm going to assume you meant os.environ['PYTHONPATH'] == sys.path". – jgomo3 Feb 10 '16 at 12:11
  • @jgomo3 I updated the question. I want sys.path *and* os.environ to be identical. Sorry os.environ['PYTHONPATH'] == sys.path was not on my mind. – guettli Feb 10 '16 at 12:42
  • @guettli Why do you want these to be the same? This sounds like an attempt to solve another problem, perhaps you're having "ImportError"s or "File Not Found"-type errors from the various ways your programs are started? _Many_ of those issues can be solved by using a common virtual environment for your program. – Seth Feb 13 '16 at 19:28
  • Looks like you want a Container like LXC or Docker for each of your applications. – jgomo3 Feb 15 '16 at 02:12
  • @Seth I updated the question. We already use virtualenv. Yes, we sometimes see ImportErrors, File-Not-Found errors and things like this. Sometimes they were related to different sys.path or os.environ. That's why I want it to be the same. – guettli Feb 22 '16 at 21:02
  • @jgomo3 we already use virtualenv. Things work very well with this lightweight virtualization. I don't see benefits using LXC or docker. – guettli Feb 22 '16 at 21:04
  • @guettli LXC and Docker are also lightweight virtualization, but not limited to Python. They are simply a sophisticated chroot, so you can have your "virtual environment" for everything: C libs, Python runtime, binaries, etc., as easily as with pyenv. So somebody who uses containers might not need virtualenvs. But containers have no advantage over something that already works well, as in your case. – jgomo3 Feb 23 '16 at 12:59
  • @jgomo3 ok, and how do I ensure that inside the container `os.environ` and `sys.path` are equal for web requests, cron, daemons started by systemd, shell, ...? – guettli Feb 24 '16 at 12:06
  • @guettli: disclaimer: I'm not a frequent user of containers, just speaking from common knowledge. The idea of containers is to run only one process inside each container: one per daemon, for example. So you should be able to set the environment to whatever you want when you start a container. With respect to sys.path, you can make it whatever you want inside the sitecustomize module, and you can define it the same way for each container. It is just an idea. – jgomo3 Feb 24 '16 at 13:13

4 Answers

6

You can use the Python port of envdir (the original is part of D. J. Bernstein's daemontools) to manage the environment variables.

If you are only concerned about Django, I suggest using envdir programmatically from your settings.py.

More generally, you can update the environment programmatically (e.g. in the WSGI file, Django's manage.py, settings.py, etc.):

import os

import envdir

# print(os.environ['FOO'])  # would raise a KeyError at this point

path = '../envdir/prod'
if not os.path.isdir(path):
    raise ValueError('%s is not a dir' % path)
envdir.Env(path)  # reads the directory and updates os.environ
print(os.environ['FOO'])

Or you can run your process through envdir on the command line, e.g.: envdir envs/prod/ python manage.py runserver

I suggest creating aliases for python, pip, etc. (as you don't want to overwrite the system's own python), e.g.: alias python-mycorp="envdir /abs/path/to/envs/prod/ python" (or if you prefer, write a full shell script instead of an alias).
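For readers who have not seen the envdir convention, here is a minimal standard-library sketch of the idea. It is not the envdir package's implementation: the helper name load_envdir is made up, and the rules it applies (file name is the variable name, first line is the value, an empty file unsets the variable) reflect my reading of the daemontools behaviour and should be treated as assumptions.

```python
import os
from pathlib import Path


def load_envdir(directory, environ=None):
    """Apply an envdir-style directory to `environ` (default: os.environ).

    Hypothetical helper for illustration; not part of the envdir package.
    """
    if environ is None:
        environ = os.environ
    root = Path(directory)
    if not root.is_dir():
        raise ValueError("%s is not a dir" % root)
    for entry in sorted(root.iterdir()):
        # Skip subdirectories and hidden files such as editor backups.
        if not entry.is_file() or entry.name.startswith("."):
            continue
        text = entry.read_text()
        if text == "":
            # Assumption: an empty file means "unset this variable".
            environ.pop(entry.name, None)
            continue
        # Only the first line is the value; trailing spaces and tabs are dropped.
        environ[entry.name] = text.splitlines()[0].rstrip(" \t")
    return environ
```

Calling load_envdir('/abs/path/to/envs/prod') at the top of the WSGI file, manage.py and the Celery entry point would give each of them the same variables, provided they all point at the same directory.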

zsepi
  • I have not heard of `envdir` before. This solves the `os.environ` part of the question. Thank you very much. – guettli Feb 10 '16 at 12:39
  • I think the `sys.path` part could be solved by `envdir` as well by setting the `PYTHONPATH` environment variable. But I think `virtualenv` is the tool for handling it cleanly. – Dag Høidahl Feb 12 '16 at 09:34
  • I guess the original docs from "D. J. Bernstein" were written for math-freaks, not human beings :-) – guettli Feb 12 '16 at 10:09
  • For systemd we use EnvironmentFile https://www.freedesktop.org/software/systemd/man/systemd.exec.html#EnvironmentFile= A directory with one file per env var could be better, but maybe too much in our context. – guettli Feb 12 '16 at 10:13
2

From the documentation of os.environ: "This mapping is captured the first time the os module is imported, typically during Python startup as part of processing site.py. Changes to the environment made after this time are not reflected in os.environ, except for changes made by modifying os.environ directly."

They all have to use the same interpreter. If they launch by the same user, they probably are.
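The documented behaviour can be observed directly. A small sketch with made-up variable names (DEMO_PUTENV and DEMO_ENVIRON are not special in any way): os.putenv changes the process environment without updating os.environ, whereas assigning to os.environ updates both.

```python
import os

# DEMO_PUTENV and DEMO_ENVIRON are made-up names for this illustration.
os.environ.pop("DEMO_PUTENV", None)

# os.putenv() changes the process environment behind os.environ's back ...
os.putenv("DEMO_PUTENV", "1")
seen_after_putenv = "DEMO_PUTENV" in os.environ

# ... while assigning to os.environ updates the mapping and the process
# environment together.
os.environ["DEMO_ENVIRON"] = "1"
seen_after_assignment = "DEMO_ENVIRON" in os.environ

print(seen_after_putenv, seen_after_assignment)
```

So any code that wants its changes to be visible to the rest of the Python process should go through os.environ rather than os.putenv.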

Aviah Laor
  • I quote: "If they launch by the same user, they probably are.". No, sorry. They are not all equal. They are very different. – guettli Feb 05 '16 at 14:55
  • Running python by the same user, on another shell, will invoke a different interpreter? – Aviah Laor Feb 05 '16 at 20:05
  • The interpreter is the same, but `os.environ` and `sys.path` are different. – guettli Feb 06 '16 at 12:38
  • The home directory will be the same. I use a .pth file in dist-packages, which is also available to all. So everything may not be identical, but the variables that I want to use are available and identical. – Aviah Laor Feb 06 '16 at 14:15
2

As you can see in the documentation of sys.path, it is initialized with the environment variable PYTHONPATH and then with an installation dependent default (site). So, they are intended to be different.

However, you can pass the -S option when invoking the interpreter (python -S script.py) to disable the automatic import of the site module and the site-specific sys.path manipulations it performs. You will still have the standard library directories in your sys.path.
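To see what the site module contributes on a particular machine, one can compare sys.path of a fresh interpreter with and without -S. This is only a sketch; the helper name path_entries is made up, and the result depends entirely on the local installation.

```python
import subprocess
import sys


def path_entries(*flags):
    """Return sys.path of a fresh interpreter started with the given flags."""
    code = "import sys; print('\\n'.join(sys.path))"
    result = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


default_path = path_entries()
no_site_path = path_entries("-S")

# Entries that only show up when the site module is imported
# (typically the site-packages directories).
print([p for p in default_path if p not in no_site_path])
```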

If you really, really want os.environ['PYTHONPATH'] == sys.path, you should do it explicitly, as the documentation says:

A program is free to modify this list for its own purposes

The standard places to put this kind of manipulation are:

  • A sitecustomize module, typically created by a system administrator in the site-packages directory, which can do arbitrary configuration.
  • A usercustomize module, whose purpose is the same as sitecustomize but which is only executed if ENABLE_USER_SITE is true.
  • Customization of sys.path directly from the script, e.g. sys.path = os.environ['PYTHONPATH'].split(os.pathsep).
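As an illustration of what such a customization might look like, here is a small sketch. The helper name pythonpath_first is hypothetical. It splits PYTHONPATH on os.pathsep because the variable is a single string while sys.path is a list, and it keeps the existing entries after the PYTHONPATH ones, since replacing sys.path wholesale would drop the standard library directories.

```python
import os
import sys


def pythonpath_first(environ=None, current=None):
    """Return PYTHONPATH entries followed by the remaining `current` entries."""
    environ = os.environ if environ is None else environ
    current = sys.path if current is None else current
    wanted = [p for p in environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    result = []
    for entry in wanted + list(current):
        # Keep the first occurrence of each entry, drop later duplicates.
        if entry not in result:
            result.append(entry)
    return result


# In sitecustomize.py (or at the top of a script) one could then write:
sys.path[:] = pythonpath_first()
```

Putting this in a sitecustomize module inside the virtualenv would apply it to every interpreter started from that virtualenv, unless the interpreter is started with -S.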
jgomo3
-2

I'm going to assume you meant os.environ['PYTHONPATH'] == sys.path , because otherwise I can't understand the question. Anyway, the solution would be to use virtualenvs.

  1. Set up a virtualenv.
  2. Edit the virtualenv's bin/activate script and add the entry export PYTHONPATH=your-sys-path.
  3. Make sure mod_wsgi, Celery, cron jobs and login shells (e.g. via ~/.bash_login) all activate the virtualenv when they start and use virtualenv/bin/python for execution.

Done.
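One way to check whether every entry point really ends up with the same interpreter setup is to dump the relevant values from each of them and diff the results. A minimal sketch, assuming the script is saved under a name of your choosing and run once from cron, once from a login shell, and so on; the function name interpreter_snapshot is made up.

```python
import json
import os
import sys


def interpreter_snapshot():
    """Collect the values that tend to differ between entry points."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "sys_path": list(sys.path),
        "environ": dict(sorted(os.environ.items())),
    }


if __name__ == "__main__":
    # Write the snapshot as JSON so that two runs can be compared with diff.
    json.dump(interpreter_snapshot(), sys.stdout, indent=2)
```

For mod_wsgi or Celery, the same function could be called from inside the application and its result written to a log file, since those processes are not started from a command line you control.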

Software Mechanic
  • We run virtualenvs, and still sys.path is different. – guettli Feb 09 '16 at 19:51
  • Do you have the PYTHONPATH variable set in activate script of the virtualenv?? – Software Mechanic Feb 10 '16 at 08:32
  • @SoftwareMechanic5 For the wsgi part of the question: We activate the virtualenv according to the docs: http://modwsgi.readthedocs.org/en/develop/user-guides/virtual-environments.html IIRC the activate script does not get executed. – guettli Feb 10 '16 at 08:42
  • @guettli: For the wsgi part, perhaps, you could try this. https://gist.github.com/GrahamDumpleton/b380652b768e81a7f60c – Software Mechanic Feb 10 '16 at 08:52