40

I have installed R using the line below in my Dockerfile. How do I now specify which packages should be installed in the Dockerfile?

RUN yum -y install R-core R-devel

I'm doing something like this:

RUN R -e "install.packages('methods',dependencies=TRUE, repos='http://cran.rstudio.com/')"\
    && R -e "install.packages('jsonlite',dependencies=TRUE, repos='http://cran.rstudio.com/')" \
    && R -e "install.packages('tseries',dependencies=TRUE, repos='http://cran.rstudio.com/')" 

Is this the right way to do it?

Ashag
  • Possible related post: https://stackoverflow.com/questions/6907937/how-to-install-dependencies-when-using-r-cmd-install-to-install-r-packages – Damian Jul 24 '17 at 20:57
  • link you gave is not about dockerfile at all. Can you suggest how do i do if you have any idea? – Ashag Jul 24 '17 at 21:01
  • Is it possible to run commands from a shell prompt or run an R script via Docker? The post presented multiple alternatives for installing packages--I thought it might be possible one of them would be applicable to your situation, even if docker was not specifically mentioned. – Damian Jul 24 '17 at 21:08
  • 1
    R won't return a failure code in this case (if, for example, you ask for a package that is not available), meaning you'll end up ignoring a build failure. install.packages doesn't return anything other than NULL. You probably want to do install.packages(...) followed by a matching library(...) and exit if library(...) fails. – Cameron Kerr Oct 03 '18 at 22:52

9 Answers

42

As suggested by @Cameron Kerr's comment, Rscript does not give you a build failure. As of now, the recommended way is to do as the question suggests.

RUN R -e "install.packages('methods',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('jsonlite',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('tseries',dependencies=TRUE, repos='http://cran.rstudio.com/')" 

If you're fairly certain of no package failures then use this one-liner -

RUN R -e "install.packages(c('methods', 'jsonlite', 'tseries'),
                           dependencies=TRUE, 
                           repos='http://cran.rstudio.com/')"

EDIT: If you don't use the base R image, you can use rocker-org's r-ver, rstudio, or tidyverse images. Here's the repo. Here's an example Dockerfile -

FROM rocker/tidyverse:latest

# Install R packages
RUN install2.r --error \
    methods \
    jsonlite \
    tseries

The --error flag is optional. It makes install.packages() throw an error if the package installation fails (which causes the docker build command to fail). By default, install.packages() only throws a warning, which means that a Dockerfile can build successfully even if it failed to install the package.

All of rocker-org's images install the littler package, which provides the install2.r functionality.
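
A variant of the example above that pins the R version through the image tag rather than relying on :latest. This is only a sketch: the 4.3.1 tag and the package names are examples, and, as far as I understand the rocker documentation, the versioned r-ver images also default to a dated CRAN snapshot, which keeps package versions stable across rebuilds. Please check that against the rocker docs for the tag you choose.

```dockerfile
# Sketch: pin the R version via the image tag instead of using :latest.
# The 4.3.1 tag is only an example; pick the version your project needs.
FROM rocker/r-ver:4.3.1

# --error fails the build when a package cannot be installed;
# --skipinstalled skips packages that are already present in the image.
RUN install2.r --error --skipinstalled \
    jsonlite \
    tseries
```

methods is left out here because it ships with R itself and does not need to be installed from CRAN.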

Ic3fr0g
  • 2
    useful in my .gitlab-ci.yml – Ferroao Sep 02 '19 at 20:46
  • 1
    install.packages() doesn't control the package version. If all this is hypothetically run 5 years from now, will it try to install a package version that's too new and conflicts with the R version? Is it better to use devtools::install_version()? – Arthur Apr 21 '23 at 20:16
  • Agreed. You could contribute [here](https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/bin/install2.r) and modify it to include package dependencies. – Ic3fr0g May 02 '23 at 04:05
8

Yes, your solution should work. I came across the same problem and found the solution here https://github.com/glamp/r-docker/blob/master/Dockerfile.

In short, use: RUN Rscript -e "install.packages('PACKAGENAME')". I have tried it and it works.

As others have mentioned in the comments, this solution will not raise an error if the package could not be installed.

elevendollar
  • 5
    Unfortunately this doesn't cause a Docker build failure when install.packages fails. – Cameron Kerr Oct 03 '18 at 22:53
  • 1
    @CameronKerr is correct, `Rscript -e "install.packages('devtoolsssss')" || echo FAIL` does not echo FAIL, so this isn't returning the needed non-zero exit code. – Terry Brown Dec 21 '20 at 22:53
5

This is ugly but it works - see below for real world example of why it's worth doing.

# install packages and check installation success, install.packages itself does not report fails
RUN R -e "install.packages('RMySQL');     if (!library(RMySQL, logical.return=T)) quit(status=10)" \
 && R -e "install.packages('devtools');   if (!library(devtools, logical.return=T)) quit(status=10)" \
 && R -e "install.packages('data.table'); if (!library(data.table, logical.return=T)) quit(status=10)" \
 && R -e "install.packages('purrr');      if (!library(purrr, logical.return=T)) quit(status=10)" \
 && R -e "install.packages('tidyr');      if (!library(tidyr, logical.return=T)) quit(status=10)"

Real-world example: the devtools install starts failing because it suddenly needs libgit2-dev. install.packages() prints useful information about the failure, but without a non-zero exit code, that just scrolls away as docker build continues.
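
The per-package checks above can probably be folded into a single R call so that everything lands in one layer. A minimal sketch, assuming R is already installed in the image; it swaps library() for requireNamespace(), which checks that a package can be loaded without attaching it, and the repos URL is just the mirror used elsewhere in this thread:

```dockerfile
# Sketch: install several packages in one layer, then fail the build if any is missing.
# The package names are examples taken from the answer above.
RUN Rscript -e "pkgs <- c('data.table', 'purrr', 'tidyr'); \
    install.packages(pkgs, repos = 'https://cran.rstudio.com/'); \
    failed <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]; \
    if (length(failed) > 0) { \
      message('Failed to install: ', paste(failed, collapse = ', ')); \
      quit(status = 10, save = 'no') \
    }"
```

The trade-off is that one failing package invalidates the whole layer, whereas separate RUN lines let Docker cache the packages that did install.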

Terry Brown
5

The best solution I found is with install2.r from the littler package.

  • First install littler
RUN R -e "install.packages('littler', dependencies=TRUE)"
  • Then you can use it from bash in your Dockerfile
RUN install2.r --error --deps TRUE methods
RUN install2.r --error --deps TRUE jsonlite
RUN install2.r --error --deps TRUE tseries

The --error flag makes the build quit if the package has not been installed correctly. The --deps TRUE flag automatically installs the dependencies of the package.
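
One thing to watch for: when littler is installed with install.packages(), the r binary and the install2.r script may end up inside the package's library directory rather than on the PATH. A hedged sketch of linking them into place, assuming /usr/local/bin is on the PATH, that littler keeps these files under bin/ and examples/ in the installed package, and that install2.r needs the docopt package to parse its arguments (all worth verifying against the littler documentation for your version):

```dockerfile
# Sketch: install littler and docopt, then symlink littler's r binary and install2.r.
# The file locations inside the installed package are assumptions; system.file()
# returns an empty string if they are wrong, which makes ln (and the build) fail.
RUN R -e "install.packages(c('littler', 'docopt'), repos = 'https://cran.rstudio.com/')" \
 && ln -s "$(Rscript -e "cat(system.file('bin', 'r', package = 'littler'))")" /usr/local/bin/r \
 && ln -s "$(Rscript -e "cat(system.file('examples', 'install2.r', package = 'littler'))")" /usr/local/bin/install2.r
```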


battgo827
3

The R -e "install.packages..." approach does not always produce an error when package installation fails.

I wrote a script based on Cameron Kerr's answer here, which produces an error if the package cannot be loaded, and interrupts the Docker build process. It installs packages from either an R package repo, from GitHub, or from source given a full URL. It also prints the time taken to install, to help plan which packages to group together in one command.

Example usage in Dockerfile:

# Install from CRAN repo:
RUN Rscript install_packages_or_die.R https://cran.rstudio.com/ Cairo
RUN Rscript install_packages_or_die.R Cairo # Uses default CRAN repo
RUN Rscript install_packages_or_die.R jpeg png tiff # Multiple packages

# Install from GitHub:
RUN Rscript install_packages_or_die.R github ramnathv/htmlwidgets
RUN Rscript install_packages_or_die.R github timelyportfolio/htmlwidgets_spin spin

# Install from source given full URL of package:
RUN Rscript install_packages_or_die.R https://cran.r-project.org/src/contrib/Archive/curl/curl_4.0.tar.gz curl

Here's the script:

#!/usr/bin/env Rscript

# Install R packages or fail with error.
#
# Arguments:
#   - First argument (optional) can be one of:
#       1. repo URL
#       2. "github" if installing from GitHub repo (requires that package 'devtools' is
#          already installed)
#       3. full URL of package from which to install from source; if used, provide package
#          name in second argument (e.g. 'curl')
#     If this argument is omitted, the default repo https://cran.rstudio.com/ is used.
#   - Remaining arguments are either:
#       1. one or more R package names, or
#       2. if installing from GitHub, the path containing username and repo name, e.g.
#          'timelyportfolio/htmlwidgets_spin', optionally followed by the package name (if
#          it differs from the GitHub repo name, e.g. 'spin').

arg_list = commandArgs(trailingOnly=TRUE)

if (length(arg_list) < 1) {
  print("ERROR: Too few arguments.")
  quit(status=1, save='no')
}

if (arg_list[1] == 'github' || grepl("^https?://", arg_list[1], perl=TRUE)) {
  if (length(arg_list) == 1) {
    print("ERROR: No package name provided.")
    quit(status=1, save='no')
  }
  repo = arg_list[1]
  packages = arg_list[-1]
} else {
  repo = 'https://cran.rstudio.com/'
  packages = arg_list
}

for(i in seq_along(packages)){
    p = packages[i]

    start_time <- Sys.time()
    if (grepl("^https?://[A-Za-z0-9.-]+/.+\\.tar\\.gz$", repo, perl=TRUE)) {
      # If 'repo' is URL with path after domain name, treat it as full path to a package
      # to be installed from source.
      install.packages(repo, repos=NULL, type="source");
    } else if (repo == "github") {
      # Install from GitHub.
      github_path = p
      elems = strsplit(github_path, '/')
      if (lengths(elems) != 2) {
        print("ERROR: Invalid GitHub path.")
        quit(status=1, save='no')
      }
      username = elems[[1]][1]
      github_repo_name = elems[[1]][2]
      if (!is.na(packages[i+1])) {
        # Optional additional argument was given specifying the R package name.
        p = packages[i+1]
      } else {
        # Assume R package name is the same as GitHub repo name.
        p = github_repo_name
      }

      library(devtools)
      install_github(github_path)
    } else {
      # Install from R package repository.
      install.packages(p, dependencies=TRUE, repos=repo);
    }
    end_time <- Sys.time()

    if ( ! library(p, character.only=TRUE, logical.return=TRUE) ) {
      quit(status=1, save='no')
    } else {
      cat(paste0("Time to install ", p, ":\n"))
      print(end_time - start_time)
    }

    if (repo == "github") {
      break
    }
}
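
For the usage lines above to work, the script has to be copied into the image first. A minimal sketch, assuming install_packages_or_die.R sits next to the Dockerfile in the build context; the /opt/r-setup directory is a hypothetical location chosen only for illustration:

```dockerfile
# Sketch: make the script available inside the image before calling it.
# /opt/r-setup is a hypothetical working directory.
WORKDIR /opt/r-setup
COPY install_packages_or_die.R .
RUN Rscript install_packages_or_die.R jpeg png tiff
```
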
DavidArndt
1

You could write an R script with the desired install commands, then run it using Docker--if I'm reading this documentation correctly (https://hub.docker.com/_/r-base/).

FROM r-base
COPY . /usr/local/src/myscripts
WORKDIR /usr/local/src/myscripts
CMD ["Rscript", "myscript.R"]

Build your image with the command:

$ docker build -t myscript /path/to/Dockerfile

Where myscript.R contains the appropriate package installation commands.
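
One caveat about the Dockerfile above: CMD sets the command that runs when a container starts, and it is not executed during docker build, so packages installed by myscript.R would not be baked into the image. If the goal is to have the packages in the image, a RUN step is probably what is wanted. A sketch, where install_packages.R is a hypothetical script holding the install.packages() calls:

```dockerfile
# Sketch: run the install script at build time so the packages become part of the image.
# install_packages.R is a hypothetical file name.
FROM r-base
COPY install_packages.R /usr/local/src/myscripts/install_packages.R
RUN Rscript /usr/local/src/myscripts/install_packages.R
```

As noted elsewhere in this thread, the script should still exit with a non-zero status when an installation fails, otherwise the build will carry on regardless.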

Damian
  • Running R script is not possible. – Ashag Jul 24 '17 at 21:30
  • 3
    I'm doing something like this: RUN R -e "install.packages('methods',dependencies=TRUE, repos='http://cran.rstudio.com/')"\ && R -e "install.packages('jsonlite',dependencies=TRUE, repos='http://cran.rstudio.com/')" \ && R -e "install.packages('tseries',dependencies=TRUE, repos='http://cran.rstudio.com/')" – Ashag Jul 24 '17 at 21:30
  • 1
    I'm not really familiar with Docker, but the documentation seems to recommend using `CMD ["Rscript", "myscript.R"]`, where myscript contains the install commands, instead of `RUN R -e` with the install command as a parameter on the command line – Damian Jul 24 '17 at 21:42
1

Are these repositories a solution to this problem?

My solution in this repository is to create two Docker images:

  • The "install image": the first image consists only of the prerequisites for the projects. When running a container from this image, it can install R packages in the format it needs inside the container and save them to {renv}'s cache on the host through a mount.
  • The "final image": the second image copies the project along with its dependencies from the host into the image.

Takuro Ikeda
0

I would like to recommend the rocker/tidyverse image, on top of which you can install other packages like this:

RUN R -e "install.packages('bigrquery',dependencies=TRUE, repos='http://cran.rstudio.com/')"

The same installation from r-base was followed by an issue with Rserve, which was probably preinstalled in the r-base image. I found nothing about this on the r-base page, so I do not recommend r-base as an easy solution.

Installation of R packages can also be done with apt-get install r-cran-*, but the maintainers of rocker/tidyverse do not recommend it for this particular image because it leads to the installation of another R version. However, you may try it and find that it is fine for your task.

0

Simple solution

Replace this

install.packages("shiny")

with this

options(warn=2); install.packages("shiny")

Why?

Setting options(warn=2) ensures you get a build error if something goes wrong when installing packages. That matters because without it, you'll just get a warning, meaning the build will continue and you may not be aware if something goes wrong!


Tips

Inside a Dockerfile, it should look like this:

RUN R -e "options(warn=2); install.packages('shiny')"

Note that if you run the warn=2 code inside RStudio, you'll still just get a warning (not an error) if something goes wrong. That's a little odd but expected: install.packages() is programmed to behave differently in RStudio, so even with warn=2 you won't get an error there. If you want to test it in an environment other than a Dockerfile, try running R from a terminal instead.
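
The reasoning behind this, as I understand it: with warn=2 every warning is turned into an error, a non-interactive R session stops at the first uncaught error and exits with a non-zero status, and Docker aborts the build when a RUN command returns non-zero. A sketch for several packages with the mirror set explicitly, since whether a default CRAN mirror is configured depends on the base image (the package names and the mirror URL are examples only):

```dockerfile
# Sketch: fail the build on any install warning, with the CRAN mirror set explicitly.
RUN R -e "options(warn = 2); install.packages(c('shiny', 'jsonlite'), repos = 'https://cran.rstudio.com/')"
```

Be aware that this also fails the build on warnings unrelated to a failed install, so it is stricter than checking each package with library() afterwards.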

stevec