
New to docker...

I need some help clarifying a basic container concept...

AFAIK, each container includes the app code, libraries, runtime, config files, etc. If I run N containers for N apps, and each app happens to use the same set of libraries, does that mean my host system literally ends up with N-1 duplicate copies of those libraries?

While containers reduce the OS overhead of the VM approach to virtualization, I am wondering whether the container approach still has room to improve in terms of resource optimization.

Thanks Mira

b.mira

1 Answer


A container is a runtime instance of an image. Docker uses a union filesystem to merge multiple layers into the root filesystem you see inside your container. Each step in the build of an image creates a layer, and each container gets its own copy-on-write layer attached so that it sees its own changes. Because of this, Docker can point multiple running instances of an image back to the same image files for the union filesystem layers. It never copies a layer when you spin up another container; they all point back to the same bytes on the filesystem.
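
To make the layering idea concrete, here is a minimal toy model in Python. It is not how Docker is implemented; it only mimics the behaviour described above, using `collections.ChainMap` as a stand-in for the union filesystem. The layer contents, paths, and the `start_container` helper are all made up for illustration.

```python
from collections import ChainMap

# Read-only image layers, modelled as dicts of path -> content.
# All paths and contents here are hypothetical.
base_layer = {"/lib/libx.so": "libx v1", "/etc/app.conf": "default"}
app_layer = {"/app/main.py": "print('hi')"}


def start_container(*image_layers):
    """Return a merged view: a fresh writable dict on top of shared layers.

    Layers are passed lowest first. ChainMap looks in its first mapping
    first, so the list is reversed, and writes always land in the new
    empty dict at the front (the container's copy-on-write layer).
    """
    return ChainMap({}, *reversed(image_layers))


c1 = start_container(base_layer, app_layer)
c2 = start_container(base_layer, app_layer)

# A change made in one container stays in that container's own layer.
c1["/etc/app.conf"] = "changed in c1"
```

After this runs, `c1` sees its own modified file, `c2` still sees the original, and both containers reference the very same `base_layer` object rather than a copy of it.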

In short, if you have a 1 GB image and spin up 100 containers from that same image, the disk holds only the 1 GB image plus any changes made in those 100 containers, not 100 GB.
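
As a rough back-of-the-envelope sketch of that arithmetic (the 10 MB of changes per container is an assumed figure, picked only for illustration, and `shared_disk_usage_mb` is a hypothetical helper):

```python
def shared_disk_usage_mb(image_mb, container_changes_mb):
    """Disk used when every container shares one copy of the image layers."""
    return image_mb + sum(container_changes_mb)


image_mb = 1024          # the 1 GB image from the example
changes = [10] * 100     # assumption: each of 100 containers writes 10 MB

shared = shared_disk_usage_mb(image_mb, changes)   # image stored once
copied = 100 * image_mb + sum(changes)             # if each container copied it
```

With these assumed numbers, the shared layout needs 2024 MB while copying the image per container would need 103400 MB.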


BMitch
  • Thanks for the reply. That's useful info. However, I am wondering about library duplication when different images are used. For example, say I create image 1 with "import X" in a Python app, and image 2 that also has "import X" in another Python app. Then I run both images (and thus both apps) on the same host. Would the host end up importing "X" twice, once inside each container? Thanks again. – b.mira Jun 21 '17 at 02:46
  • @b.mira There is caching in the creation of each layer of an image. If the same command has been run on the same prior layer, Docker will use its layer cache instead of creating a new layer, even if the layer was first made by another image. You can be more explicit by creating your own base image with all the common libraries in it, and building your other images off of that. – BMitch Jun 21 '17 at 09:14
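
The caching rule in the last comment ("same command on the same prior layer") can be sketched as a small Python toy. This is a simplified model, not Docker's actual cache logic (which, for example, also looks at file contents for copy steps). The `build_layer` and `build_image` helpers, the ID scheme, and the build steps are all hypothetical.

```python
import hashlib

layer_store = {}  # layer id -> (parent id, command); shared by all images


def build_layer(parent_id, command):
    """Return (layer_id, cached). A layer is keyed by its parent and command."""
    key = hashlib.sha256(f"{parent_id}\n{command}".encode()).hexdigest()[:12]
    cached = key in layer_store
    layer_store.setdefault(key, (parent_id, command))
    return key, cached


def build_image(commands):
    """Build layers in order; return (top layer id, number of cache hits)."""
    layer_id, hits = "scratch", 0
    for command in commands:
        layer_id, cached = build_layer(layer_id, command)
        hits += int(cached)
    return layer_id, hits


# Two different apps that start from the same base and install the same library.
image1, hits1 = build_image(["FROM python:3", "RUN pip install X", "COPY app1.py /app/"])
image2, hits2 = build_image(["FROM python:3", "RUN pip install X", "COPY app2.py /app/"])
```

In this model the second image reuses the first two layers (two cache hits) and adds only its own final layer, so four layers are stored in total instead of six. Putting the common libraries in a shared base image, as suggested, makes that reuse explicit rather than relying on the steps happening to match.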