I understand this question has some popular answers already. But there is a newer way to cache files for package managers. I think it could be a good answer in the future when BuildKit becomes more standard.
As of Docker 18.09 there is experimental support for BuildKit. BuildKit adds support for some new features in the Dockerfile including experimental support for mounting external volumes into RUN
steps. This allows us to create caches for things like $HOME/.cache/pip/
.
We'll use the following requirements.txt
file as an example:
Click==7.0
Django==2.2.3
django-appconf==1.0.3
django-compressor==2.3
django-debug-toolbar==2.0
django-filter==2.2.0
django-reversion==3.0.4
django-rq==2.1.0
pytz==2019.1
rcssmin==1.0.6
redis==3.3.4
rjsmin==1.1.0
rq==1.1.0
six==1.12.0
sqlparse==0.3.0
A typical example Python Dockerfile
might look like:
FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
COPY . /usr/src/app
With BuildKit enabled using the DOCKER_BUILDKIT
environment variable we can build the uncached pip
step in about 65 seconds:
$ export DOCKER_BUILDKIT=1
$ docker build -t test .
[+] Building 65.6s (10/10) FINISHED
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load metadata for docker.io/library/python:3.7 0.5s
=> CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092 0.0s
=> [internal] load build context 0.6s
=> => transferring context: 899.99kB 0.6s
=> CACHED [internal] helper image for file operations 0.0s
=> [2/4] COPY requirements.txt /usr/src/app/ 0.5s
=> [3/4] RUN pip install -r requirements.txt 61.3s
=> [4/4] COPY . /usr/src/app 1.3s
=> exporting to image 1.2s
=> => exporting layers 1.2s
=> => writing image sha256:d66a2720e81530029bf1c2cb98fb3aee0cffc2f4ea2aa2a0760a30fb718d7f83 0.0s
=> => naming to docker.io/library/test 0.0s
Now, let us add the experimental header and modify the RUN
step to cache the Python packages:
# syntax=docker/dockerfile:experimental
FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . /usr/src/app
Go ahead and do another build now. It should take the same amount of time. But this time it is caching the Python packages in our new cache mount:
$ docker build -t pythontest .
[+] Building 60.3s (14/14) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 0.5s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3 0.0s
=> [internal] load .dockerignore 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load metadata for docker.io/library/python:3.7 0.5s
=> CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092 0.0s
=> [internal] load build context 0.7s
=> => transferring context: 899.99kB 0.6s
=> CACHED [internal] helper image for file operations 0.0s
=> [2/4] COPY requirements.txt /usr/src/app/ 0.6s
=> [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt 53.3s
=> [4/4] COPY . /usr/src/app 2.6s
=> exporting to image 1.2s
=> => exporting layers 1.2s
=> => writing image sha256:0b035548712c1c9e1c80d4a86169c5c1f9e94437e124ea09e90aea82f45c2afc 0.0s
=> => naming to docker.io/library/test 0.0s
About 60 seconds. Similar to our first build.
Make a small change to the requirements.txt
(such as adding a new line between two packages) to force a cache invalidation and run again:
$ docker build -t pythontest .
[+] Building 15.9s (14/14) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> resolve image config for docker.io/docker/dockerfile:experimental 1.1s
=> CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 120B 0.0s
=> [internal] load .dockerignore 0.0s
=> [internal] load metadata for docker.io/library/python:3.7 0.5s
=> CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092 0.0s
=> CACHED [internal] helper image for file operations 0.0s
=> [internal] load build context 0.7s
=> => transferring context: 899.99kB 0.7s
=> [2/4] COPY requirements.txt /usr/src/app/ 0.6s
=> [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt 8.8s
=> [4/4] COPY . /usr/src/app 2.1s
=> exporting to image 1.1s
=> => exporting layers 1.1s
=> => writing image sha256:fc84cd45482a70e8de48bfd6489e5421532c2dd02aaa3e1e49a290a3dfb9df7c 0.0s
=> => naming to docker.io/library/test 0.0s
Only about 16 seconds!
We are getting this speedup because we are no longer downloading all the Python packages. They were cached by the package manager (pip
in this case) and stored in a cache volume mount. The volume mount is provided to the run step so that pip
can reuse our already downloaded packages. This happens outside any Docker layer caching.
The gains should be much better on larger requirements.txt
.
Notes:
- This is experimental Dockerfile syntax and should be treated as such. You may not want to build with this in production at the moment.
The BuildKit stuff doesn't work under Docker Compose or other tools that directly use the Docker API at the moment. There is now support for this in Docker Compose as of 1.25.0. See How do you enable BuildKit with docker-compose?
- There isn't any direct interface for managed the cache at the moment. It is purged when you do a
docker system prune -a
.
Hopefully, these features will make it into Docker for building and BuildKit will become the default. If / when that happens I will try to update this answer.