
Some of my Python projects are tested under continuous integration with a setup like the one described here: "Pretty" Continuous Integration for Python.

I currently use easy_install to install the project, its dependencies and the test tools (nose, coverage).

Sometimes, my builds are reported as failed because easy_install was not able to download the dependencies due to networking problems: the internet connection, PyPI, or one of the package download servers is down or unresponsive.

I would like to prevent my build from failing in such cases by using a local cache of packages: when a fresh dependency cannot be downloaded, we fall back to the local copy (which should be updated whenever possible). It's important to me to first try downloading a fresh dependency, because I want to be alerted as soon as possible when my project breaks due to an API change in a dependency.

My question is: how can I set up such a cache that doesn't break on networking problems? I first tried collective.eggproxy for this, but it doesn't catch all errors as far as I know.
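To make the intent concrete, here is a minimal sketch of the try-fresh-then-fall-back logic I'm after; the command lines in the usage comment are illustrative assumptions, not what I actually run:

```python
import subprocess

def install_with_fallback(commands):
    """Run each install command in order; return the first one that succeeds.

    The idea: the first command tries to fetch fresh dependencies from the
    network, and later commands fall back to a local package cache.
    """
    for cmd in commands:
        if subprocess.call(cmd) == 0:
            return cmd
    raise RuntimeError("all install attempts failed")

# Illustrative usage -- paths and commands are assumptions:
# install_with_fallback([
#     ["easy_install", "nose", "coverage"],                      # fresh from PyPI
#     ["easy_install", "-f", "file:///var/cache/eggs", "nose"],  # local cache
# ])
```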

Sylvain Prat
  • I'm looking into this as well, keep us updated on your progress! – monkut Feb 29 '12 at 09:00
  • Collective.eggproxy note: it got moved to github early February 2012: https://github.com/camptocamp/collective.eggproxy , so it should be much easier now to try small fixes (by forking) or reporting bugs and so. And yes, it could very well be that it doesn't catch all networking errors. – Reinout van Rees Feb 29 '12 at 09:30
  • In fact, `collective.eggproxy` works perfectly fine but I did not let enough time for the server daemon to start before trying to use it in `easy_install` – Sylvain Prat Mar 02 '12 at 07:12

4 Answers


Have you considered using pip instead? If so, you could take advantage of its support for alternate package repositories:

http://www.pip-installer.org/en/latest/usage.html#alternate-package-repositories

If you don't want to move away from easy_install, you could try using the --find-links option with easy_install to provide a basic set of links for the packages you care about.
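For example, something along these lines; the cache directory path and package names are placeholders:

```shell
# Populate a local directory with the sdists/eggs you depend on, then
# point easy_install at it so cached copies can satisfy the install:
easy_install --find-links=/var/cache/eggs nose coverage
```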

Amber
  • I can use the --find-links option but how can I keep the local cache updated? – Sylvain Prat Feb 29 '12 at 09:12
  • I would also prefer using a real cache since it speeds up the build because we avoid downloading locally cached packages – Sylvain Prat Feb 29 '12 at 09:14
  • You could use a combination of both the alternative package repository options *and* pip's download cache option to get both the speed and offline-redundancy aspects. – Amber Mar 02 '12 at 08:14

I agree with Amber about using pip. pip offers at least 3 options for supporting spotty pypi access:

  1. The alternate package repository flags described above (`-i` for the index URL and `--find-links`)

  2. A download cache, specified by setting the PIP_DOWNLOAD_CACHE environment variable; downloaded files are cached for later installs.

  3. Creating a bundle of all of your dependencies.

We have used all three at one point or another. For a long time we were exclusively using option 3, but we have since set up our own PyPI server using basketweaver.
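On the command line, the three options might look like this; the URLs, paths, and package names are illustrative, and note that the `bundle` subcommand existed in pip releases of this era but was later removed:

```shell
# 1. Alternate index and/or extra download links
pip install -i http://d.pypi.python.org/simple --find-links=/var/cache/pip-links nose coverage

# 2. Download cache: fetched files are reused on later installs,
#    though pip still contacts the index each time
export PIP_DOWNLOAD_CACHE=$HOME/.pip/cache
pip install nose coverage

# 3. Bundle all dependencies once, then install from the bundle
pip bundle deps.pybundle -r requirements.txt
pip install deps.pybundle
```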

easy_install also supports the --index-url option, which lets you specify another index, e.g. one of the existing PyPI mirrors:

easy_install -i http://d.pypi.python.org/simple 
turtlebender
  • 2) An internet connection is still required with PIP_DOWNLOAD_CACHE: see http://stackoverflow.com/a/4806458/145583 – Sylvain Prat Feb 29 '12 at 13:51
  • true, which is why we first went with 3 :) – turtlebender Feb 29 '12 at 20:51
  • With 3), I still don't see how you get fresh dependencies when the connection is OK. Furthermore, it seems to me that the bundle would be built on the continuous integration server? Or did I miss something? – Sylvain Prat Mar 01 '12 at 06:48
  • We stored the bundle in git and generated it on our dev boxes. So, the continuous integration environment was dependent on the devs keeping the bundle up to date in git – turtlebender Mar 01 '12 at 13:05

If you're going to be running your own build server, I'd really suggest having a local PyPI cache for your builds, if only so that you don't hammer PyPI (and inflate download counts) for packages you install on every build. There was recently a really good blog post on setting up a mirror of all of PyPI:

http://aboutsimon.com/2012/02/24/create-a-local-pypi-mirror/

And for a sprint I recently used a trick with pip to set up a local cache of just the packages my application needs:

http://paste.mitechie.com/show/548/
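In case that paste disappears, the trick boils down to something like the following two-step sketch; the cache directory path is an assumption:

```shell
# Fetch the project's dependencies into a local directory once...
pip install --download=/var/cache/pip-packages -r requirements.txt

# ...then install offline, using only the locally cached copies:
pip install --no-index --find-links=/var/cache/pip-packages -r requirements.txt
```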

Rick
  • Having a full PyPI mirror for just one project seems overkill to me (Need 30 GB!!). That's why I just want a cache of the dependencies *I use*. – Sylvain Prat Mar 01 '12 at 06:45

I ended up using collective.eggproxy to cache the downloads, but added a startup delay after launching collective.eggproxy as a daemon, to prevent errors from occurring when easy_install is invoked before collective.eggproxy has fully started.
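Roughly like this; the daemon launch command, port, and delay here are specific to my setup and are shown only as an illustration:

```shell
# Start the eggproxy daemon, give it time to come up, then install through it.
eggproxy &                # however the daemon is launched on your system
sleep 10                  # the startup delay that fixed my failures
easy_install -i http://localhost:8888/ nose coverage
```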

However, the answers suggesting pip seem equally valid to me; but since I already use easy_install and collective.eggproxy, it's easier for me to stick with them.

Sylvain Prat