I made a fairly large Docker container. When I commit the container to create an image, the image is about 7.8 GB. But when I export the container (not save the image!) to a tarball and re-import it, the image is only 3 GB. Of course the history is lost, but this is OK for me, since the image is "done" in my opinion and ready for deployment.

How can I flatten an image/container without exporting it to disk and importing it again? And is it wise to do that, or am I missing an important point?

Thomas Uhrig
  • Do you build with the `--rm` option? This removes intermediate containers. Or did I misunderstand the question? – shabbychef Apr 02 '14 at 18:45
  • There are some other tricks to make the image smaller: chain several install commands in one `RUN`, remove unneeded Ubuntu packages, etc. See https://github.com/dckc/ipython-docker/blob/master/Dockerfile for a good example. – shabbychef Apr 02 '14 at 18:47

4 Answers

Docker 17.05 introduced multi-stage builds, so you can restructure your build to look like this:

FROM buildimage as build
# your existing build steps here
FROM scratch
COPY --from=build / /
CMD ["/your/start/script"]

As a result, the build-environment layers stay cached on the build server, but only a flattened copy ends up in the image that you tag and push.
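One thing to keep in mind with the `FROM scratch` pattern: the final stage starts from an empty image configuration, so settings such as ENV, ENTRYPOINT, WORKDIR, and USER from the build image are not carried over by `COPY --from`. Here is a minimal sketch, assuming a POSIX shell and a hypothetical helper name `print_image_config`, that lists those settings so they can be re-declared in the final stage:

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): print the config fields of an
# image that a "FROM scratch" + "COPY --from" stage would NOT inherit.
print_image_config() {
    image="$1"
    for field in Env Entrypoint Cmd WorkingDir User; do
        # docker image inspect renders a single config field as JSON.
        value=$(docker image inspect --format "{{json .Config.$field}}" "$image") || return 1
        printf '%s: %s\n' "$field" "$value"
    done
}

# Usage (the image name is a placeholder):
#   print_image_config buildimage
```

Each printed field can then be translated by hand into the matching Dockerfile instruction after the `COPY --from=build / /` line.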


Note that you would typically restructure this so that a complex build environment does the work and the final image only copies over a few directories. Here's an example with Go that produces a single-binary image from source code with a single build command, without installing Go on the host or compiling outside of Docker:

$ cat Dockerfile 
ARG GOLANG_VER=1.8
FROM golang:${GOLANG_VER} as builder
WORKDIR /go/src/app
COPY . .
RUN go-wrapper download 
RUN go-wrapper install

FROM scratch
COPY --from=builder /go/bin/app /app
CMD ["/app"]

The go file is a simple hello world:

$ cat hello.go 
package main

import "fmt"

func main() {
        fmt.Printf("Hello, world.\n")
}

The build creates both environments, the build environment and the scratch one, and then tags the scratch one:

$ docker build -t test-multi-hello .
Sending build context to Docker daemon  4.096kB
Step 1/9 : ARG GOLANG_VER=1.8
 ---> 
Step 2/9 : FROM golang:${GOLANG_VER} as builder
 ---> a0c61f0b0796
Step 3/9 : WORKDIR /go/src/app
 ---> Using cache
 ---> af5177aae437
Step 4/9 : COPY . .
 ---> Using cache
 ---> 976490d44468
Step 5/9 : RUN go-wrapper download
 ---> Using cache
 ---> e31ac3ce83c3
Step 6/9 : RUN go-wrapper install
 ---> Using cache
 ---> 2630f482fe78
Step 7/9 : FROM scratch
 ---> 
Step 8/9 : COPY --from=builder /go/bin/app /app
 ---> Using cache
 ---> 5645db256412
Step 9/9 : CMD /app
 ---> Using cache
 ---> 8d428d6f7113
Successfully built 8d428d6f7113
Successfully tagged test-multi-hello:latest

Looking at the images, only the single binary is in the image being shipped (1.56MB), while the build environment is about 700MB:

$ docker images | grep 2630f482fe78
<none>                <none>              2630f482fe78        6 days ago          700MB

$ docker images | grep 8d428d6f7113
test-multi-hello      latest              8d428d6f7113        6 days ago          1.56MB

And yes, it runs:

$ docker run --rm test-multi-hello 
Hello, world.
BMitch
  • This should be the accepted answer by now. It's very effective and flexible! – Dean Christian Armada Sep 20 '17 at 14:07
  • Note that with this approach the WORKDIR, ENTRYPOINT, ENV, etc. are lost, so you must repeat them in the final stage. Apart from that, perfect! :) – Keymon May 11 '18 at 15:27
  • Note that `COPY` rewrites all filesystem ownership to root (or the user running the command), which may not be desirable. – ti7 Jul 03 '18 at 19:13

Since Docker 1.13, you can use the `--squash` flag of `docker build`.
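A minimal sketch of how the flag might be used, assuming a POSIX shell and a daemon running in experimental mode. The helper name `squash_build` is hypothetical, and behavior may differ or the flag may be unavailable with newer builders. The flag squashes only the layers created by the build itself; the layers of the base image stay as they are.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): build with --squash only when
# the daemon reports that experimental features are enabled.
squash_build() {
    tag="$1"       # target image tag (placeholder)
    context="$2"   # build context directory

    experimental=$(docker version --format '{{.Server.Experimental}}') || return 1
    if [ "$experimental" != "true" ]; then
        echo "daemon is not in experimental mode; --squash is unavailable" >&2
        return 1
    fi

    # Squashes the newly built layers into a single new layer.
    docker build --squash -t "$tag" "$context"
}

# Usage (tag and context are placeholders):
#   squash_build myimage:squashed .
```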


Before version 1.13:

To my knowledge, you cannot do this with the Docker API. `docker export` and `docker import` are designed for this scenario, as you already mention.

If you don't want to save to disk, you could probably pipe the output stream of export into the input stream of import. I have not tested this, but try:

docker export red_panda | docker import - exampleimagelocal:new
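The same pipe can also start from an image rather than a running container. Below is a minimal sketch, assuming a POSIX shell. The helper name `flatten_image` and all image names are hypothetical. It uses `docker create` to obtain a container without starting it, and `docker import --change` to re-apply a `CMD`, because export and import drop the image metadata.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): flatten an image by exporting
# a never-started container and re-importing its filesystem.
flatten_image() {
    src_image="$1"   # e.g. myimage:latest (placeholder)
    dst_image="$2"   # e.g. myimage:flat (placeholder)
    start_cmd="$3"   # JSON array for CMD, e.g. '["/your/start/script"]'

    # docker create makes a container without starting it; the trailing
    # "true" is only a placeholder command and is never executed here.
    container_id=$(docker create "$src_image" true) || return 1

    # Stream the container filesystem straight into docker import.
    # --change re-applies metadata that export/import would otherwise drop.
    docker export "$container_id" \
        | docker import --change "CMD $start_cmd" - "$dst_image"
    status=$?

    # Remove the temporary container whether or not the import succeeded.
    docker rm "$container_id" >/dev/null
    return "$status"
}

# Usage (all names are placeholders):
#   flatten_image myimage:latest myimage:flat '["/your/start/script"]'
```

Other settings such as ENV or ENTRYPOINT would need additional `--change` options in the same way.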
qkrijger
  • I just did this with "Docker version 1.1.1, build bd609d2" and the resulting image wasn't much smaller; it was actually a bit bigger. But the history was gone for the new image. – VolkerK Jul 17 '14 at 08:25
  • Works well with Docker 17.03.0-ce. The image size was reduced from 33GB to 19GB. – Hitman_99 Mar 31 '17 at 07:05
  • `--squash` requires experimental flags for some daemons. `export` works on containers, not images. Do you have any suggestions on how to convert an image into a container without running anything inside that image? One idea I have is to run an entrypoint that doesn't exist in the container. That seems to still create a new container. – init_js Nov 23 '18 at 10:11
  • The metadata of the source image is not preserved across export+import. ([issue](https://github.com/moby/moby/issues/8334#issuecomment-57471427)). So you lose ENTRYPOINT, ENV, and CMD that way. Maybe `--squash` is the better way to flatten _and_ keep the metadata? – init_js Apr 10 '19 at 06:09
  • Where can you use the `--squash` flag? – Nic Jun 18 '19 at 16:36
  • `docker build --squash .` – Steve Scott Sep 29 '19 at 23:45
  • This is still marked as experimental as of end of 2019. – Rob Grant Dec 22 '19 at 16:37
  • So you can't use squash on an existing image, e.g. one that you have committed from a container. – MrR Jun 09 '20 at 22:02
  • A caution, `docker export CONTAINER` does not export the contents of mounted volumes. – Jesse Chisholm Jun 11 '20 at 01:06
  • When squashing, you lose the ability to share layers between images. – Rax Jun 29 '21 at 20:29

Take a look at docker-squash

Install with:

pip install docker-squash

Then, if you have an image, you can squash all layers into one with:

docker-squash -f <nr_layers_to_squash> -t new_image:tag existing_image:tag

A quick 1-liner that is useful for me to squash all layers:

docker-squash -f $(($(docker history "$IMAGE_NAME" | wc -l | xargs)-1)) -t "${IMAGE_NAME}:squashed" "$IMAGE_NAME"
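The layer count in that one-liner can also be obtained without subtracting the header row. Here is a minimal sketch, assuming a POSIX shell. The helper name `count_layers` is hypothetical, and it relies on `docker history --quiet` printing one line per history entry with no header.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or docker-squash): count the
# history entries of an image.
count_layers() {
    # --quiet prints one ID per history entry and no header row;
    # tr strips the padding that some wc implementations add.
    docker history --quiet "$1" | wc -l | tr -d ' '
}

# Usage (requires docker-squash; the image name is a placeholder):
#   IMAGE_NAME=myimage
#   docker-squash -f "$(count_layers "$IMAGE_NAME")" -t "${IMAGE_NAME}:squashed" "$IMAGE_NAME"
```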
Roman

Build the image with the --squash flag:

https://docs.docker.com/engine/reference/commandline/build/#squash-an-images-layers---squash-experimental

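Also consider removing unneeded files, such as the apt cache. Without squashing, this only shrinks the image when it runs in the same `RUN` instruction that created the files: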

RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Ryan H.
thetoolman