Apologies if this is a simple question, but it is something I am really struggling to understand!
As background, I am trying to create a Dockerfile which installs a lot of R CRAN and R Bioconductor packages, as well as some R packages from GitHub. I want to do this as quickly as possible, so I'm using rocker's base image to install binary files; see here for a great, quick tutorial: https://datawookie.dev/blog/2019/01/docker-images-for-r-r-base-versus-r-apt/
My approach is first to install all the packages I need as binaries and, if any are not available as binaries, to install them from source. After this, I use the Bioconductor base image to install the necessary Bioconductor packages.
However, the packages I installed through the rocker base image aren't available after I import the Bioconductor base image. This is where I feel I don't have a clear understanding of how Dockerfiles work, and I can't find an answer in any documentation. Is there some way to copy these packages over after importing another image? I don't know whether this is necessary; I have seen others take the same approach, such as the question poster here: Minimizing the size of docker image R shiny app
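To make the question concrete, here is a minimal sketch of what I imagine "copying over" could look like, using Docker's multi-stage COPY --from. The stage name rbase and both library paths are assumptions on my part and are untested; I also don't know whether packages built for one image's R version and OS would load in the other.

```dockerfile
# Sketch only: the stage name "rbase" and the library paths are assumptions
FROM rocker/r-ubuntu:18.04 AS rbase
RUN apt-get update && \
    apt-get install -y -qq r-cran-tidyverse

# A second FROM starts a new build stage from a different image,
# so nothing installed in the rbase stage is present here by default
FROM bioconductor/bioconductor_docker:RELEASE_3_12

# Copy the installed packages out of the first stage (both paths assumed)
COPY --from=rbase /usr/lib/R/site-library /usr/local/lib/R/site-library
```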
To note, I import the Bioconductor base image because I thought this would help deal with dependency issues. I guess I could just install the Bioconductor packages the same way as the R packages that weren't available as binaries, but I want to do this as quickly and cleanly as possible and I thought that would slow things down.
Essentially, I want to know the quickest way to install R binaries, R non-binaries, R Bioconductor and GitHub packages all in one Dockerfile.
An example of my approach is below, with a very small subset of the packages I need. I have shown my full approach to installing R binaries, R non-binaries, R Bioconductor and GitHub packages, but for the issue I am having, see what happens to the tidyverse R package before and after I import the Bioconductor image; the call library(tidyverse) runs before but fails after:
Dockerfile
## Use r-ubuntu, prev r-apt:bionic to enable the use of binary r packages for speed for R 4.0
FROM rocker/r-ubuntu:18.04
## Install available binaries - for speed
RUN apt-get update && \
    apt-get install -y -qq \
    r-cran-tidyverse \
    r-cran-ids \
    r-cran-snow
## Install remaining packages from source
COPY ./requirements-src.R .
RUN Rscript requirements-src.R
## This works
RUN R -e 'library(tidyverse)'
## Install Bioconductor packages
# Docker inheritance
FROM bioconductor/bioconductor_docker:RELEASE_3_12
COPY ./requirements-bioc.R .
#Don't bother running for speed but this will run
#RUN R -e 'BiocManager::install(ask = F)' && Rscript requirements-bioc.R
#This will fail - can't find the package
RUN R -e 'library(tidyverse)'
## Install from GH the following
#Don't bother running for speed but this will run
#RUN installGithub.r mojaveazure/loomR
EXPOSE 8787
## Make R the default
CMD ["R"]
requirements-src.R
pkgs <- c(
'spelling',
'english',
'DT'
)
install.packages(pkgs)
requirements-bioc.R
bioc_pkgs <- c(
  'biomaRt',
  'DropletUtils',
  'rhdf5'
)
BiocManager::install(bioc_pkgs, ask = FALSE)