How do you best pin package versions in R?

Rejected strategy 1: Pin to CRAN source tar.gzs

  • Doesn't work if you want to pin to the latest version, since CRAN does not put the current (tip) version in the archive (duh)
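
For reference, the limitation comes from how CRAN lays out its source tarballs: the current release sits directly under src/contrib, and only superseded releases live under src/contrib/Archive/<package>/. A minimal sketch that builds both candidate URLs; the helper name `cran_source_urls` and the package and version in the usage line are illustrative assumptions, not part of any library:

```r
# Hypothetical helper (not part of any package): build the two places a
# given source tarball can live on CRAN.
cran_source_urls <- function(pkg, version,
                             base = "https://cran.r-project.org/src/contrib") {
  tarball <- sprintf("%s_%s.tar.gz", pkg, version)
  c(
    # Where the tarball lives while `version` is still the latest release.
    current = paste(base, tarball, sep = "/"),
    # Where it moves once a newer release has been published.
    archive = paste(base, "Archive", pkg, tarball, sep = "/")
  )
}

# Example call with a made-up package/version pair.
urls <- cran_source_urls("somepkg", "1.2.3")
print(urls)
```

A pin to a single one of these URLs breaks the moment the package moves from one location to the other, which is the problem described above.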

Rejected strategy 2: Use devtools

  • Don't want to, because it takes ages to compile and adds lots of stuff I don't want to use

Rejected strategy 3: Vendor

  • Would rather avoid having to copy all source
salient
  • What you mean by 'pin' isn't clear, but I think you want packrat https://rstudio.github.io/packrat/ which controls package versioning for a project. – r.bot Jul 28 '17 at 10:18
  • pin as in set to a particular version to be used when building. `packrat` is basically a vendoring lib as I understand it? – salient Jul 28 '17 at 10:40
  • I believe so, I've never actually used it but am aware of it. For that reason I put it as a comment rather than an answer. – r.bot Jul 28 '17 at 10:43
  • There are similar questions already on SO, such as this one which suggests virtual machines as a solution https://stackoverflow.com/questions/8860802/how-to-install-and-manage-many-versions-of-r-packages?rq=1 – r.bot Jul 28 '17 at 10:45
  • well, the destination for my package builds is a Docker container :-). I need a way to set it to particular versions to be used in production and for reproducibility. – salient Jul 28 '17 at 10:46
  • I once tried the [miniCRAN](https://github.com/RevolutionAnalytics/miniCRAN) way and it was quite convenient. It adds some maintenance work though. You can either host a network sub-repository with a "frozen" snapshot (as I did) or distribute the package bundle. – tonytonov Jul 28 '17 at 11:12

1 Answer

To provide a little more information on packrat, which I use for this purpose, here is the description from its website:

R package dependencies can be frustrating. Have you ever had to use trial-and-error to figure out what R packages you need to install to make someone else’s code work–and then been left with those packages globally installed forever, because now you’re not sure whether you need them? Have you ever updated a package to get code in one of your projects to work, only to find that the updated package makes code in another project stop working?

We built packrat to solve these problems. Use packrat to make your R projects more:

  • Isolated: Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because packrat gives each project its own private package library.
  • Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
  • Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.

Packrat records the versions of the packages you use in the packrat.lock file (inside the project's packrat/ directory), and then downloads those versions from CRAN whenever you call packrat::restore(). It is much lighter weight than devtools, but re-downloading and installing all of the packages can still take some time, depending on which packages you use.
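
A minimal sketch of that workflow, assuming packrat is already installed and the project lives in the directory you pass in. The wrapper names (`setup_pins`, `update_pins`, `restore_pins`) are hypothetical and only there to label the steps; the `packrat::` calls themselves are the package's own functions.

```r
# Hypothetical wrappers around packrat's functions, one per step.

# One-time setup: give the project a private library and a lockfile.
setup_pins <- function(project_dir = ".") {
  packrat::init(project = project_dir, enter = FALSE)
}

# After installing or upgrading a package: record the versions now in use.
update_pins <- function(project_dir = ".") {
  packrat::snapshot(project = project_dir, prompt = FALSE)
}

# On another machine or in a build: install the versions from the lockfile.
restore_pins <- function(project_dir = ".") {
  packrat::restore(project = project_dir, prompt = FALSE)
}
```

For a Docker build like the one mentioned in the comments, one plausible setup (an assumption about your build, not something packrat prescribes) is to copy the project, including its packrat/ directory, into the image and run the restore step during the build.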

If you prefer to store all of the sources in a single archive, you can use packrat::snapshot() to pull down the sources and update packrat.lock, and then packrat::bundle() to bundle everything up (the bundle is a .tar.gz rather than a zip). The aim is to make projects and research reproducible and portable over time by storing the package versions and dependencies used in the original work, along with their source code, so that there is no dependency on OS-specific binaries.
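
A sketch of the bundling route, again assuming packrat is installed. The helper names `bundle_project` and `unbundle_project` and their arguments are hypothetical; `snapshot()`, `bundle()` and `unbundle()` are packrat functions.

```r
# Hypothetical helper: refresh the lockfile and sources, then write one archive.
bundle_project <- function(project_dir = ".", bundle_file = NULL) {
  # snapshot.sources = TRUE also stores the package source tarballs.
  packrat::snapshot(project = project_dir, snapshot.sources = TRUE,
                    prompt = FALSE)
  # With file = NULL, packrat picks its default bundle location.
  packrat::bundle(project = project_dir, file = bundle_file,
                  include.src = TRUE)
}

# Hypothetical helper: unpack a bundle elsewhere and restore its library.
unbundle_project <- function(bundle_file, where = ".") {
  packrat::unbundle(bundle = bundle_file, where = where, restore = TRUE)
}
```

Because the sources travel inside the bundle, the restore on the receiving side should not need to fetch those packages from CRAN again.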

There is much more information on the website I linked to, and you can see current activity on the git repo. I have encountered a few cases that work in a less-than-ideal way (packages not on CRAN have some issues at times), but the git repo still seems to be pretty active with issues/patches which is encouraging.

cole