
I'm spinning up 1000 containers on a single Docker network, all from the same Docker image.

Currently it takes a long time to deploy. I've separated the process into `docker create` and `docker start`, as opposed to the monolithic `docker run`.

Is there any way to spin containers up in parallel? I'm happy to work with a programming interface (Go, C, whatever) or to use CLI commands.
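
For concreteness, this is roughly what I'm doing now, sequentially; the network name, subnet, image, and addressing scheme below are placeholders:

```sh
# Create the user-defined network once (placeholder subnet).
docker network create --subnet 172.20.0.0/16 mynet

# Create every container up front with a fixed IP and hostname...
for i in $(seq 1 1000); do
  docker create \
    --name "node$i" \
    --network mynet \
    --ip "172.20.$((i / 256 + 1)).$((i % 256))" \
    --hostname "node$i" \
    myimage
done

# ...then start them all, which is the slow, serial part.
for i in $(seq 1 1000); do
  docker start "node$i"
done
```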

Related: Can Docker Engine start containers in parallel [asked and answered 3 years ago]

A T
  • Have you tried Swarm or Docker Compose? – takacsmark Sep 24 '18 at 05:58
  • Aren't those just for running containers that are related to each other? - AFAIK they don't `docker run`/`docker start` them concurrently. – A T Sep 24 '18 at 06:18
  • You can use either Swarm or Compose to start up a single service (meaning a service that uses a Docker image) and scale the number of containers. Both options are viable; Swarm works on a cluster of nodes and will give you load balancing, too. Compose will not take care of port conflicts, and it works on a single machine only. Swarm and Compose both have deployment options so that you can fine-tune them to your project. – takacsmark Sep 24 '18 at 07:56
  • @Mark you should make it an answer. – jannis Sep 24 '18 at 08:26
  • So my `docker create` step, which sets the network, IP address and hostname, will no longer be usable? – A T Sep 24 '18 at 08:28
  • Well, that depends on the details of your use case; if it works for you, the options are there. You define your network and specify a subnet in the Compose file, and the container names will be assigned automatically (with a standard naming pattern). It really depends on what you want to achieve with these 1000 containers. – takacsmark Sep 24 '18 at 16:57
  • I'm experimenting with different consensus algorithms and I need to know all node names and addresses ahead of time. What if I dropped down to writing Go; would that enable me to `docker start` in parallel? – A T Sep 26 '18 at 07:11
  • "I need to know all node names and addresses ahead of time" – I highly recommend implementing this with Kubernetes, especially using Kops, which makes creating the infrastructure very easy. It will manage networks, DNS, the creation of instance clusters, etc. for you. Then 100, 1000 or 10000 will just be numbers that trigger scaling. – Robert Sep 27 '18 at 10:10
  • I've already written the node address + key pair generator in Go, so that won't be a bottleneck… – A T Sep 27 '18 at 11:44

2 Answers


Deploying the containers with Swarm or K8s is just an added layer of abstraction on top of deploying lots of containers with a start command, they won't speed up the process (only make it easier to manage), so I'm not sure why so many are quick to recommend that for your question. More layers of abstraction don't speed up a solution. What they do allow is horizontal scaling, so if you can spread these containers across more docker hosts, an orchestration solution can make that easier to manage and give you some automated recovery from any faults.

The `run` command is a wrapper around `create`/`start`, and all of these docker commands are small wrappers around a REST API to dockerd. You can go directly to that API, but the time is likely spent setting up the namespaces, including IPAM and iptables rules for the networking. I'm not aware of any parallel APIs that would speed up the start of lots of containers. One option to speed this up is to remove some of the namespace isolation, or to look at other options for the drivers used to create a namespace. Switching to host networking completely skips the container bridge network and iptables rules, putting your container in the same network namespace as the host.
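
As a sketch of those last points, you can at least issue the start calls concurrently from the client side; this assumes GNU xargs and a curl build with unix-socket support, and the filter, parallelism level, and socket path are placeholders:

```sh
# Start every container still in "created" state, 16 CLI
# processes at a time; dockerd still does the real work.
docker ps -aq --filter status=created | xargs -P 16 -n 1 docker start

# The same call aimed straight at the REST API
# (POST /containers/{id}/start on the dockerd socket), skipping the CLI:
docker ps -aq --filter status=created |
  xargs -P 16 -I{} curl -s --unix-socket /var/run/docker.sock \
    -X POST "http://localhost/containers/{}/start"
```

Note this only overlaps the per-container round-trips; the daemon may still serialize parts of the namespace and iptables setup internally.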

Or with networking, you can configure the container with the "none" network to avoid any connections to a bridge network and iptables rules, though you'll still have a loopback address. You do have the option to connect a running container to a network after starting it, so if you don't have published ports, that may be an option to split the start command from the networking setup, depending on your use case.
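
A minimal sketch of that attach-after-start option; the network `mynet`, the container name, and the address are placeholders:

```sh
# Create and start without wiring up the user-defined network.
docker create --name node1 myimage
docker start node1

# Attach the already-running container to the network afterwards,
# optionally pinning a static address.
docker network connect --ip 172.20.1.1 mynet node1
```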

Beyond that, if you want faster, you may need better or more hardware, perhaps a newer kernel with performance enhancements, or you may need to remove some of the abstraction layers. If you're willing to handle some of the networking and other pieces yourself, you could connect directly to the containerd backend used by docker. Just note that in doing so, you will lose some of the functionality that docker provides.

BMitch

1000 containers in a single VM (or bare-metal machine) doesn't sound like a scalable solution.

I would suggest a container orchestration system like Kubernetes, Docker Swarm, or Nomad (in the order of my personal preference) with a multi-node, high-availability configuration. Docker Swarm is much simpler to get up and running than its popular counterpart, Kubernetes (K8s).
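
For instance, with Swarm the scale-out is a single flag; a sketch, assuming a placeholder image `myimage` and an overlay network `mynet`:

```sh
# One-time setup on the manager node.
docker swarm init
docker network create --driver overlay mynet

# Run the image as a service and let Swarm schedule 1000 replicas.
docker service create --name nodes --network mynet --replicas 1000 myimage

# Rescale later without redeploying.
docker service scale nodes=2000
```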

so-random-dude