34

This question is part of my continuing exploration of Docker and in some ways follows up on one of my earlier questions. I have now understood how one can get a full application stack (effectively a mini VPS) working by linking together a bunch of Docker containers. For example, one could create a stack that provides Apache + PHP5 with a sheaf of extensions + Redis + Memcached + MySQL, all running on top of Ubuntu, with or without an additional data container to make it easy to serialize user data.
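For concreteness, here is a sketch of what that might look like with linked containers. The image name my/apache-php5 and the container names are illustrative, not taken from an actual setup:

# optional data-only container to hold user data
docker run --name appdata -v /var/www/html busybox true

# backing services, one container each
docker run -d --name mysql -e MYSQL_ROOT_PASSWORD=secret mysql
docker run -d --name redis redis
docker run -d --name memcached memcached

# Apache + PHP5 container, linked to the services above
docker run -d -p 8080:80 --link mysql:mysql --link redis:redis \
  --link memcached:memcached --volumes-from appdata my/apache-php5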

All very nice and elegant. However, I cannot help but wonder: that is 5 containers to run one little VPS (I count 5, not 6, since Apache + PHP5 go into one container). So suppose I have 100 such VPSs running? That means I have 500 containers running! I understand the arguments here - it is easy to compose new app stacks, update one component of the stack, etc. But are there no unnecessary overheads to operating this way?

Suppose I did this

  • Put all my apps inside one container
  • Write up a little shell script, start.sh:

    #!/bin/bash
    # start the backing services
    service memcached start
    service redis-server start
    ....
    service apache2 start
    # keep the container alive in the foreground
    while :
    do
        :
    done

In my Dockerfile I have

# copy the startup script into the image and make it executable
ADD start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh

....
# bash is the entrypoint and the script is its default argument,
# so the container effectively runs: /bin/bash /usr/local/bin/start.sh
ENTRYPOINT ["/bin/bash"]
CMD ["/usr/local/bin/start.sh"]

I then get that container up & running

docker run -d -p 8080:80 -v /var/droidos/site:/var/www/html -v /var/droidos/logs:/var/log/apache2 droidos/minivps

and I am in business. Now, when I want to shut down that container programmatically, I can do so with a single docker command.
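For example (assuming the container was started with --name minivps, which the run command above does not actually include):

# docker stop sends SIGTERM to the container's main process, then SIGKILL after a grace period
docker stop minivps
# remove the stopped container
docker rm minivps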

There are many questions of a similar nature to be found when one Googles for them. Apart from the arguments I have reproduced above, one of the commonest reasons given for the one-app-per-container approach is "that is the way Docker is designed to work". What I would like to know is:

  • What are the downsides to running 100 instances of N linked containers - the tradeoffs in terms of speed, memory usage, etc. on the host?
  • What is wrong with what I have done here?
  • You might want to take a look at https://phusion.github.io/baseimage-docker/ - according to the developers, it solves the majority of the problems that arise when more than one app runs within a Docker image – nassim Jan 30 '20 at 20:21

2 Answers

24

A container is basically a process. There is no technical issue with running 500 processes on a decent-sized Linux system, although they will have to share the CPU(s) and memory.

The cost of a container over a process is some extra kernel resources to manage namespaces, file systems and control groups, and some management structures inside the Docker daemon, particularly to handle stdout and stderr.

The namespaces are introduced to provide isolation, so that one container does not interfere with any others. If your groups of 5 containers form a unit that does not need this isolation, then you can share the network namespace using --net=container:<name>. There is no feature at present to share cgroups, AFAIK.
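A sketch of what that looks like (the image and container names here are illustrative):

# the web container owns the network namespace
docker run -d --name web droidos/apache-php5

# redis and memcached join web's namespace: they share its IP address,
# port space and loopback interface
docker run -d --name redis --net=container:web redis
docker run -d --name memcached --net=container:web memcached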

What is wrong with what you suggest:

  • it is not "the Docker way". Which may not be important to you.
  • you have to maintain the scripting to get it to work, worry about process restart, etc., as opposed to using an orchestrator designed for the task
  • you will have to manage conflicts in the filesystem, e.g. two processes need different versions of a library, or they both write to the same output file
  • stdout and stderr will be intermingled for the five processes
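On the last point, a quick illustration (container names are illustrative): with one process per container, docker logs gives each service its own stream, whereas in the single big container everything the processes write to stdout/stderr arrives on one stream:

# separate containers: separate, cleanly attributable streams
docker logs redis
docker logs apache

# one big container: all five processes share a single stream
docker logs minivps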
Bryan
  • thank you for your answer. I wonder if you could clarify two things: 1. The docs say "Tells Docker to put this container's processes inside ... and processes on the two containers will be able to connect to each other over the loopback interface." Does this mean that with --net apache I will be able to access, say, memcached running in another container on localhost? 2. With the one-app-per-container route you still require scripting to start & link the containers, plus a script to shut down & remove all containers. So six of one & half a dozen of the other? – DroidOS Dec 27 '14 at 10:36
  • 1. Yes, when you use `--net=another_container` they are put in the same network namespace, with the same virtual Ethernet device and loopback device. They can talk to each other, but that is also true of `--net=bridge`, which you get by default. The noticeable thing about sharing the network is that they share IP addresses and ports, and use a bit less in the way of system resources, which is what you asked about. – Bryan Dec 28 '14 at 09:22
  • 2. There are loads of tools to control Docker containers from the outside - see http://stackoverflow.com/a/18287169/448734. So you don't have to do so much of the scripting yourself. Note that this is a fast-evolving environment. – Bryan Dec 28 '14 at 09:30
7

@Bryan's answer is solid, particularly in relation to the low overhead of a container that runs just one process.

That said, you should at least read the arguments at https://phusion.github.io/baseimage-docker/, which makes a case for having containers with multiple processes. Without such extra processes, Docker is light on provision for:

  • process supervision
  • cron jobs
  • syslog

baseimage-docker runs an init process which fires up a few processes besides the main one in the container.
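As a rough sketch of how that looks in practice (based on my reading of the baseimage-docker docs; the redis service here is just an illustration), extra services are registered as runit scripts under /etc/service and are started and supervised by baseimage's /sbin/my_init:

FROM phusion/baseimage
# ... install redis-server etc. here ...

# register redis as a runit service; my_init starts and supervises it
RUN mkdir -p /etc/service/redis
ADD redis.sh /etc/service/redis/run
RUN chmod +x /etc/service/redis/run

CMD ["/sbin/my_init"]

where redis.sh keeps the server in the foreground so runit can supervise it:

#!/bin/sh
exec /usr/bin/redis-server --daemonize no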

For some purposes this is a good idea, but also be aware that, for instance, having a cron daemon and a syslog daemon in every container adds a bit more overhead, and that adds up. I expect that as the Docker ecosystem matures we'll see better solutions that don't require this.

mc0e