I am using the exact same section in two different Dockerfiles (base is ubuntu:18.04 in both cases) that downloads two files from a remote location using wget.

ENV ROBOT=v1.5.0
ENV ROBOT_JAR=https://github.com/ontodev/robot/releases/download/$ROBOT/robot.jar
RUN wget $ROBOT_JAR -O /tools/robot.jar && \
    wget https://raw.githubusercontent.com/ontodev/robot/$ROBOT/bin/robot -O /tools/robot && \
    chmod +x /tools/*

docker history --no-trunc [...]

tells me that in one Dockerfile the layer created by this command is 114 MB:

... /bin/sh -c wget $ROBOT_JAR -O /tools/robot.jar &&     wget https://raw.githubusercontent.com/ontodev/robot/$ROBOT/bin/robot -O /tools/robot &&     chmod +x /tools/* 114MB   

and in the other only 44.9MB:

... /bin/sh -c wget $ROBOT_JAR -O /tools/robot.jar &&     wget https://raw.githubusercontent.com/ontodev/robot/$ROBOT/bin/robot -O /tools/robot &&     chmod +x /tools/* 44.9MB              

Apart from sharing the same base image, the two Dockerfiles are of course very different (the 114 MB one is huge, for example, while the 45 MB one defines only two layers). I am curious: what could cause the difference in size, and can it be mitigated somehow?

EDIT 1:

Here is the 114 MB case: https://github.com/INCATools/ontology-development-kit/blob/master/Dockerfile

Here is the other: https://github.com/INCATools/ontology-development-kit/blob/master/docker/testdocker/Dockerfile

matentzn
  • Are there other things in the `/tools` directory in the image? Can you include more context on what else is happening in the two Dockerfiles? – David Maze Jan 24 '20 at 11:08
  • Thanks for looking at it! I updated the question with links to the two dockerfiles in question. Yes, there are many more things in the /tools directory. Could this play a role? – matentzn Jan 24 '20 at 11:32

1 Answer

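The chmod +x /tools/* is the reason. Docker layers store whole files, not diffs: every time a command changes a file, even just its metadata via chmod or chown, a whole new duplicate copy of that file is stored in the new layer. Because the glob matches everything in /tools, files placed there by earlier layers get copied into this layer again, which would explain why the layer is so much bigger in the Dockerfile that has already put many other things into /tools.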
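One way to keep this layer small without restructuring the build is to stop using the glob and chmod only the two files that were just downloaded. This is a minimal sketch, not a tested fix for the linked Dockerfiles; it assumes, as in the question, that wget is already installed and that /tools already exists. It keeps the executable bit on both files to match the original behaviour.

```dockerfile
# Fragment: drop-in replacement for the section in the question.
# Assumes wget is installed and /tools exists from earlier instructions.
ENV ROBOT=v1.5.0
ENV ROBOT_JAR=https://github.com/ontodev/robot/releases/download/$ROBOT/robot.jar

# chmod names only the files created in this same RUN step, so nothing
# from earlier layers is touched and copied into this layer.
RUN wget $ROBOT_JAR -O /tools/robot.jar && \
    wget https://raw.githubusercontent.com/ontodev/robot/$ROBOT/bin/robot -O /tools/robot && \
    chmod +x /tools/robot /tools/robot.jar
```

Since the files are created and chmod-ed within one RUN instruction, they are stored only once in that layer.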

You can use multi-stage builds to create a final image that doesn't have all the intermediary layers, e.g. if you're using Python here's a guide to it: https://pythonspeed.com/articles/smaller-python-docker-images/

And here are the general Docker docs on multi-stage builds: https://docs.docker.com/develop/develop-images/multistage-build/
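As a rough illustration of the multi-stage approach applied to the download step from the question, the sketch below fetches and chmods the files in a throwaway stage and copies the result into the final image. The stage name `downloader` and the apt-get/mkdir lines are assumptions added to make the example self-contained; they are not taken from the linked Dockerfiles.

```dockerfile
# Stage 1: download and set permissions. The layers of this stage
# are not part of the final image.
FROM ubuntu:18.04 AS downloader
ENV ROBOT=v1.5.0
RUN apt-get update && \
    apt-get install -y --no-install-recommends wget ca-certificates && \
    mkdir -p /tools && \
    wget https://github.com/ontodev/robot/releases/download/$ROBOT/robot.jar -O /tools/robot.jar && \
    wget https://raw.githubusercontent.com/ontodev/robot/$ROBOT/bin/robot -O /tools/robot && \
    chmod +x /tools/robot /tools/robot.jar

# Stage 2: the final image receives the files in a single COPY layer,
# with the mode bits set in the first stage.
FROM ubuntu:18.04
COPY --from=downloader /tools/ /tools/
```

Note that this only helps if the final stage does not run `chmod +x /tools/*` again afterwards; any later instruction that modifies those files would still duplicate them in its own layer.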

Itamar Turner-Trauring
  • Thank you for this! A related issue for what you outlined can be found here https://stackoverflow.com/questions/30085621/why-does-chown-increase-size-of-docker-image (now that I know what I needed to search for). – matentzn Jan 27 '20 at 15:24