The reference is to caching of the layers. Anytime you run the same command against the same previous layer, Docker will attempt to reuse the cached layer for that command.
So if you add another package to your list a few months from now and rerun docker build with two separate RUN commands, the apt-get update layer would be reused from the cache and you'd have a 3-month-old package index in your image. The apt-get install command in the second RUN would then fail for any package versions that are no longer in the package repository.
When you make it a single RUN command, it becomes a single layer in the filesystem cache, so the update reruns on your rebuild months from now and the install uses packages that are currently in the package repository.
Edit: Since this still doesn't seem clear, here's a sample scenario of how it goes wrong:
Using the following Dockerfile:
FROM debian:latest
RUN apt-get update
RUN apt-get install -y \
    bzr \
    cvs \
    git \
    mercurial \
    subversion
When I run docker build -t my-app:latest . it outputs a long list that ends with:
Processing triggers for libc-bin (2.19-18+deb8u4) ...
Processing triggers for systemd (215-17+deb8u4) ...
Processing triggers for ca-certificates (20141019+deb8u1) ...
Updating certificates in /etc/ssl/certs... 174 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d....done.
Processing triggers for sgml-base (1.26+nmu4) ...
---> 922e466ac74b
Removing intermediate container 227318b98393
Successfully built 922e466ac74b
Now, if I change this file to add unzip to the package list, and assume it's months later so the cached apt-get update layer now contains stale data:
FROM debian:latest
RUN apt-get update
RUN apt-get install -y \
    bzr \
    cvs \
    git \
    mercurial \
    subversion \
    unzip
If I run that right now, it will work:
Step 1 : FROM debian:latest
---> 1b088884749b
Step 2 : RUN apt-get update
---> Using cache
---> 81ca47119e38
Step 3 : RUN apt-get install -y bzr cvs git mercurial subversion unzip
---> Running in 87cb8380ec90
Reading package lists...
Building dependency tree...
The following extra packages will be installed:
ca-certificates dbus file fontconfig fontconfig-config fonts-dejavu-core
gir1.2-glib-2.0 git-man gnupg-agent gnupg2 hicolor-icon-theme
....
Processing triggers for libc-bin (2.19-18+deb8u4) ...
Processing triggers for systemd (215-17+deb8u4) ...
Processing triggers for ca-certificates (20141019+deb8u1) ...
Updating certificates in /etc/ssl/certs... 174 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d....done.
Processing triggers for sgml-base (1.26+nmu4) ...
---> d6d1135481d3
Removing intermediate container 87cb8380ec90
Successfully built d6d1135481d3
But if you look at the above output, the apt-get update step shows:
---> Using cache
Which means it didn't run the update, it just reused an old layer that ran that step before. When that's only 5 minutes old, it's no issue. But when it's months old, you'll see errors.
The fix, as Docker mentions, is to run the update and install in the same RUN step, so that when the install cache is invalidated, the update also reruns.
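A minimal sketch of that combined form, reusing the package list from the scenario above. The scenario only shows the split version, so the exact layout here is an assumption:

```dockerfile
FROM debian:latest
# Update and install share one RUN instruction, and therefore one cached layer.
# Editing the package list changes the instruction text, which invalidates the
# cache for the whole step, so apt-get update reruns before the install.
RUN apt-get update && apt-get install -y \
    bzr \
    cvs \
    git \
    mercurial \
    subversion \
    unzip
```

With this layout, adding or removing a package forces a fresh apt-get update, so the install always runs against a current package index.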