Is it possible to remove a file completely from an intermediate Docker layer?

E.g. I have:

Layer A: with BigFile
 L Layer B: remove BigFile
   L Layer C: builds on B

Because deleting a file in a later layer only hides it, BigFile is still stored in Layer A. It isn't accessible in C, but it still adds to the overall image size.

What I'd like though is to remove BigFile completely from C.

What's the solution? Is it just to remove it from Layer A in the first place?

Snowcrash

2 Answers

What's the solution? Is it just to remove it from Layer A in the first place?

Once a file is in a layer, it can't be removed from that layer. But if you're careful to remove the file in the same build step that added it, it is never persisted in that build step's layer.

The simple, common example is to run apt-get update, install packages, and then remove the package index, all in the same RUN command, to keep the apt index out of the image layer.

https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#run discusses this example in more detail.
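As a minimal sketch of that pattern, assuming a Debian-based image (the base image tag and the curl package are placeholders chosen for illustration, not anything from the question):

```dockerfile
# Assumed base image; any apt-based image works the same way.
FROM debian:bookworm-slim

# Update the index, install, and delete the index in ONE RUN instruction.
# The layer is committed only after the whole command finishes, so the
# index files under /var/lib/apt/lists never end up in it.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

# By contrast, splitting this into three RUN instructions would leave the
# index stored in the first layer, and the later "rm" would only hide it.
```

The same idea applies to BigFile: download, use, and delete it inside one RUN instruction if it is only needed temporarily.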

erik258
  • My situation is that, for security reasons, I am downloading the packages in an intermediate layer, then ADD'ing them to the final image, followed by a RUN command to install the packages and delete them. My goal is to delete the downloaded packages once the RUN command installs them. Is there any way to a) combine the ADD and RUN into one layer; b) remove the ADD layer? – variable Jun 03 '20 at 08:48
  • Image squashing is probably the obvious solution for this; see @Rory McCune's answer below. You can also use a multi-stage build to copy from named build stages. – erik258 Jun 03 '20 at 15:23
  • I am using a multi-stage build. I am downloading packages (100 MB) in stage 1. In stage 2 I copy the packages (ADD --FROM....) and install them (RUN). After the install I want to get rid of the packages. – variable Jun 03 '20 at 18:56
  • Cool, okay. The only way to keep those separate would be to get the installed files out of the stage that did the ADD/RUN and into a separate stage. Living with the packages in your image or squashing the layers would be my suggestion. – erik258 Jun 03 '20 at 21:25
  • Do you mean squash the ADD --FROM... (copy packages from previous stage) and RUN (install) layers? And then how do I delete the packages completely? – variable Jun 04 '20 at 06:17
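
For the multi-stage case discussed in these comments, one option nobody in the thread mentions is a BuildKit bind mount, which lets a RUN step read files from another stage without copying them into a layer first. The following is only a sketch under assumptions: it needs BuildKit, the stage name "downloader", the /packages path, and the "less" package are all made up for illustration, and the download line is a stand-in for whatever the real download step is.

```dockerfile
# syntax=docker/dockerfile:1

# Stage 1: fetch the packages (stand-in for the real download step).
FROM debian:bookworm-slim AS downloader
WORKDIR /packages
RUN apt-get update && apt-get download less

# Stage 2: install straight from stage 1 through a bind mount.
# The mount exists only while this RUN executes, so the .deb files
# are never written into any layer of the final image.
FROM debian:bookworm-slim
RUN --mount=type=bind,from=downloader,source=/packages,target=/tmp/packages \
    dpkg -i /tmp/packages/*.deb
```

This replaces the separate copy and install steps with a single RUN, so there is no layer holding the packages to clean up afterwards.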

To get this, as long as you don't mind losing the image cache, you can squash the image. There are a couple of ways to do this.

With experimental features enabled, docker build accepts a --squash option, which merges the newly built layers into a single new layer, so BigFile would not be present in the result.

Without that, you can create a container from the image that ends at Layer C, then use docker export and docker import to flatten its filesystem into a single layer, as described in this answer
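
A similar flattening can also be sketched in a Dockerfile with a multi-stage build. This is a different technique from docker export and docker import, shown here only as an illustration; the tag "my-image:layer-c" is a hypothetical name for the image built up to Layer C, and the CMD value is an example.

```dockerfile
# Hypothetical tag for the existing image whose top layer is Layer C.
FROM my-image:layer-c AS full

# Start from an empty image and copy the final filesystem in as one layer.
# BigFile was deleted in Layer B, so it is not part of what gets copied.
FROM scratch
COPY --from=full / /

# Metadata such as CMD, ENTRYPOINT and ENV is not inherited from "full",
# so restate whatever the image needs (example value only).
CMD ["/bin/sh"]
```

As with export and import, the result shares no layers with the original image, so it is worth checking file ownership and permissions in the flattened image before relying on it.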

Rory McCune