745

I'd like to comprehensively understand the run-time performance cost of a Docker container. I've found references to networking anecdotally being ~100µs slower.

I've also found references to the run-time cost being "negligible" and "close to zero" but I'd like to know more precisely what those costs are. Ideally I'd like to know what Docker is abstracting with a performance cost and things that are abstracted without a performance cost. Networking, CPU, memory, etc.

Furthermore, if there are abstraction costs, are there ways to get around the abstraction cost. For example, perhaps I can mount a disk directly vs. virtually in Docker.
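
Something like this is the kind of workaround I have in mind (the path and image name here are made up), bind-mounting a host directory so writes bypass Docker's layered filesystem:

# hypothetical: bind-mount a host directory instead of writing into the container's layered filesystem
docker run -v /mnt/fastdisk:/data my-app-image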

noɥʇʎԀʎzɐɹƆ
Luke Hoersten
  • 4
    possible duplicate of [Is there a formula for calculating the overhead of a Docker container?](http://stackoverflow.com/questions/21799402/is-there-a-formula-for-calculating-the-overhead-of-a-docker-container) – Golo Roden Feb 19 '14 at 18:22
  • 4
    @GoloRoden that question is similar but not exactly the same. I'm looking for latency costs with reasons like "networking is being passed through an extra layer" whereas that question's accepted answer is more about measuring the costs of the container + app. – Luke Hoersten Feb 19 '14 at 18:29
  • 3
    Okay, that's right. I retracted my close vote. – Golo Roden Feb 19 '14 at 18:43
  • 11
    I'm glad you posted it though. That question didn't come up in my search. The measurement/metrics article is super useful: http://blog.docker.io/2013/10/gathering-lxc-docker-containers-metrics/ – Luke Hoersten Feb 19 '14 at 22:15
  • 2
    This is a good session titled "Linux Containers - NextGen Virtualization for Cloud" telling performance metrics by comparing docker, KVM VM and bare metal: https://www.youtube.com/watch?v=a4oOAVhNLjU – shawnzhu May 22 '14 at 03:49
  • I wrote a benchmark for MemSQL; it seems the Docker version is 44% slower, but I don't know which part is the bottleneck, the network/NAT part or the CPU: https://kokizzu.blogspot.com/2019/12/go-orm-benchmark-on-memsql.html – Kokizzu Dec 13 '19 at 21:15
  • Mr. @michael-larabel , If you are _the_ [Michael Larabel](https://www.phoronix.com/), do you have anything to add to this discussion? I see [docker containers on linux distros](https://www.phoronix.com/scan.php?page=search&q=Docker), but I can't seem to find much on Docker vs Native on your fantastic site. – Ross Rogers Jul 24 '21 at 16:34

4 Answers

675

An excellent 2014 IBM research paper “An Updated Performance Comparison of Virtual Machines and Linux Containers” by Felter et al. provides a comparison between bare metal, KVM, and Docker containers. The general result is: Docker is nearly identical to native performance and faster than KVM in every category.

The exception to this is Docker’s NAT — if you use port mapping (e.g., docker run -p 8080:8080), then you can expect a minor hit in latency, as shown below. However, you can now use the host network stack (e.g., docker run --net=host) when launching a Docker container, which will perform identically to the Native column (as shown in the Redis latency results lower down).
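
As a quick sketch (the image name and port are just illustrative), the two modes look like this:

# port-mapped (NAT) networking: traffic goes through iptables DNAT / the userland proxy
docker run -d -p 8080:8080 my-service-image

# host networking: the container shares the host's network stack, so there is no NAT hop
docker run -d --net=host my-service-image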

Docker NAT overhead

They also ran latency tests on a few specific services, such as Redis. You can see that above 20 client threads, latency overhead is highest for Docker NAT, then KVM, with Docker host networking and native roughly tied.

Docker Redis Latency Overhead

Because it's such a useful paper, here are some other figures; please download it for full access.

Taking a look at Disk I/O:

Docker vs. KVM vs. Native I/O Performance

Now looking at CPU overhead:

Docker CPU Overhead

Now some examples of memory (read the paper for details, memory can be extra tricky):

Docker Memory Comparison

ib.
Hamy
  • I'd expect KVM to have significantly less networking overhead if using PCI-IOV hardware with PCI passthrough configured and enabled. Which certainly isn't a configuration for the faint-of-heart, granted. – Charles Duffy Jan 08 '15 at 03:34
  • 29
    As for the linpack numbers given in the paper... frankly, I find them hard to believe (not that I disbelieve that they're what linpack emitted, but that I disbelieve that the test was genuinely measuring nothing but floating-point performance as performed). The major overhead from KVM is in the userspace hardware emulation components (which only apply to _non-CPU_ hardware); there's significant overhead around memory paging... but raw floating-point? I'd want to look at what was actually going on there -- perhaps excessive context switches. – Charles Duffy Jan 08 '15 at 03:38
  • It seems they have added the Docker numbers for the round-trip latency. And it's a bit worse than KVM. – danuker Mar 26 '15 at 10:44
  • 1
    Thanks @danuker - always better to have some numbers than have to guess like I was doing. I've updated the answer to include the new data – Hamy Mar 30 '15 at 22:04
  • Glad to help, although I just linked to someone else's work ;-) Be sure to skim the paper at least - it's pretty interesting! – Hamy Apr 02 '15 at 21:47
  • Unbelievable, Docker's performance is super good, nearly native. – duykhoa Apr 10 '15 at 02:02
  • This docker document speaks about the impacts of setting the `-net=host` flag: https://docs.docker.com/articles/networking/#how-docker-networks-a-container – Michael Allan Jackson Aug 18 '15 at 17:37
  • 4
    Correction for current Docker CLI syntax: `--net=host` (two dashes) and `-p 8080:8080` (lower case 'p') for NAT. – bk0 Dec 17 '15 at 19:55
  • Docker `--net=host` documentation moved to https://docs.docker.com/engine/reference/run/#network-settings – Sam Apr 05 '16 at 04:23
  • @Hamy: Why is there no comparison of throughput? Shouldn't that also be of interest? – arne.z Apr 18 '16 at 13:54
  • @洋葱头 - It's covered in the linked paper, and I didn't want to copy/paste *all* of their diagrams :-) See Section E – Hamy Apr 18 '16 at 15:45
  • 13
    The cited IBM paper seems too focused on network IO. It never addresses context switches. We looked at LXC and had to quickly abandon it due to increased non-voluntary context switches resulting in degraded application processing. – Eric May 05 '17 at 03:09
  • @Eric fantastic info - I'd strongly recommend you take a moment to email the paper's authors with what you found so they can construct a follow-on paper with a 3-year update and some new tests incl. one showing the degradation you experienced – Hamy Jul 05 '17 at 12:24
  • 10
    I'm also curious about filesystem operations -- directory lookups, for instance, are a place where I'd expect to see overhead; block-level reads, writes and seeks (which the given charts focus heavily on) *aren't*. – Charles Duffy Oct 14 '17 at 12:48
  • 1
    one of the best answers I've ever seen – Muhammad Ali Oct 25 '18 at 12:03
  • 123
    I love charts with the same shade color. It's so easy to distinguish – Viktor Joras May 27 '19 at 10:51
  • 12
    More interested in docker versus native than docker vs any kind of vm. – Joseph Garvin Jun 26 '19 at 23:35
  • [This paper](https://www.diva-portal.org/smash/get/diva2:1252694/FULLTEXT01.pdf) did some tests in a microservices architecture and concluded that performance decreased, though. – Stefan Hendriks Jun 03 '21 at 12:30
  • 1
    @StefanHendriks thanks for sharing. Seems to me the field is ready for an in depth comparison of kubernetes networking, the native "docker-specific" nat options, the swarm overlay network, perhaps even whatever nomad and ECS use for networking, service mesh, etc. It's clear that the NAT approaches have a slowdown, would be great to see more attention there – Hamy Sep 10 '21 at 21:08
  • @StefanHendriks I have not read it yet, but this paper looks promising: https://www.researchgate.net/publication/340357513_Performance_analysis_of_container-based_networking_solutions_for_high-performance_computing_cloud – Hamy Sep 10 '21 at 21:14
  • Stack Overflow should pay you for answers like this! You do the work and they make money. – iconoclast Aug 31 '23 at 17:59
166

Docker isn't virtualization, as such -- instead, it's an abstraction on top of the kernel's support for different process namespaces, device namespaces, etc.; one namespace isn't inherently more expensive or inefficient than another, so what actually makes Docker have a performance impact is a matter of what's actually in those namespaces.
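
A quick way to see this for yourself (the container name and image are arbitrary): start a container, find its PID on the host, and compare its namespaces with your shell's; it's an ordinary host process that has simply been placed in different namespaces:

# start a throwaway container
docker run -d --name ns-demo alpine sleep 1000

# find the container's main process as seen from the host
pid=$(docker inspect -f '{{.State.Pid}}' ns-demo)

# its namespaces are just entries under /proc; compare with your own shell ($$)
sudo ls -l /proc/$pid/ns
sudo ls -l /proc/$$/ns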


Docker's choices in terms of how it configures namespaces for its containers have costs, but those costs are all directly associated with benefits -- you can give them up, but in doing so you also give up the associated benefit:

  • Layered filesystems are expensive -- exactly what the costs are varies with each one (and Docker supports multiple backends), and with your usage patterns (merging multiple large directories, or merging a very deep set of filesystems, will be particularly expensive), but they're not free. On the other hand, a great deal of Docker's functionality -- being able to build guests off other guests in a copy-on-write manner, and getting the storage advantages implicit in same -- rides on paying this cost (there's a sketch of opting out of these defaults below the list).
  • DNAT gets expensive at scale -- but gives you the benefit of being able to configure your guest's networking independently of your host's and have a convenient interface for forwarding only the ports you want between them. You can replace this with a bridge to a physical interface, but again, lose the benefit.
  • Being able to run each software stack with its dependencies installed in the most convenient manner -- independent of the host's distro, libc, and other library versions -- is a great benefit, but needing to load shared libraries more than once (when their versions differ) has the cost you'd expect.

And so forth. How much these costs actually impact you in your environment -- with your network access patterns, your memory constraints, etc -- is an item for which it's difficult to provide a generic answer.
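
As a rough sketch of trading some of those benefits back for performance (the image name and paths are placeholders, and which storage backend you actually have depends on your installation):

# see which layered-filesystem backend is in use
docker info | grep -i 'storage driver'

# skip the layered filesystem for hot data by bind-mounting a host directory
docker run -v /mnt/host-data:/data my-image

# skip DNAT entirely by sharing the host's network stack
docker run --net=host my-image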

Charles Duffy
  • 3
    This is a good answer but I'm looking for more specific numbers and benchmarks. I'm familiar with the cost of cgroups but Docker is more than that as you've pointed out. Thanks a lot for the answer. – Luke Hoersten Apr 02 '15 at 21:55
  • 12
    Sure. My point is that any generalized benchmarks you find will be of very limited applicability to any specific application -- but that's not to say I disagree with folks trying to provide them, but merely that they should be taken with a heaping tablespoon of salt. – Charles Duffy Apr 02 '15 at 22:03
  • 1
    In that manner you could say that KVM "is not a virtualization it is simply an abstraction on top of x86 virtual technology calls". – Vad Aug 19 '16 at 09:10
  • 12
    @Vad, there's consensus agreement, going back decades (to IBM's early non-x86 hardware implementations!), that providing abstraction directly on the hardware layer is unambiguously virtualization. Consensus for terminology around kernel-level namespacing is considerably more fragmented -- we could each point to sources favoring our individual views -- but frankly, there are useful technical distinctions (around both security and performance characteristics) that moving to a single term would obscure, so I'm holding my position until and unless contrary industry consensus is reached. – Charles Duffy Oct 06 '16 at 17:02
  • 1
    @LukeHoersten, ...right, it's not the cgroups that have a significant cost, it's much more the contents of the network and filesystem namespaces. But *how much those costs are* depends almost entirely on how Docker is configured -- which specific backends you're using. Bridging is much, *much* cheaper than Docker's default NAT, for example; and the various filesystem backends' performance overhead also varies wildly (and in some cases, the amount of overhead depends on usage patterns; overlayfs variants can be much more expensive with big directories modified through multiple layers f/e). – Charles Duffy Jan 15 '19 at 17:00
31

Here are some more benchmarks comparing a Docker-based memcached server against a host-native memcached server, using the Twemperf benchmark tool (https://github.com/twitter/twemperf) with 5000 connections and a 20k connection rate.

The connect-time overhead for Docker-based memcached seems to agree with the whitepaper above, at roughly twice the native connect time.
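
For context, a minimal way to reproduce this kind of setup might look like the following; the image tag, ports, and exact mcperf flags are assumptions on my part, with the numbers chosen to mirror the 5000 connections / 20k connection rate above:

# memcached inside Docker, with port mapping (the NAT path)
docker run -d --name mc-docker -p 11211:11211 memcached

# memcached directly on the host, for the native comparison
memcached -d -p 11212

# drive load against either endpoint with twemperf's mcperf client
mcperf --server=127.0.0.1 --port=11211 --num-conns=5000 --conn-rate=20000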

Twemperf Docker Memcached

Connection rate: 9817.9 conn/s
Connection time [ms]: avg 341.1 min 73.7 max 396.2 stddev 52.11
Connect time [ms]: avg 55.0 min 1.1 max 103.1 stddev 28.14
Request rate: 83942.7 req/s (0.0 ms/req)
Request size [B]: avg 129.0 min 129.0 max 129.0 stddev 0.00
Response rate: 83942.7 rsp/s (0.0 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 28.6 min 1.2 max 65.0 stddev 0.01
Response time [ms]: p25 24.0 p50 27.0 p75 29.0
Response time [ms]: p95 58.0 p99 62.0 p999 65.0

Twemperf Centmin Mod Memcached

Connection rate: 11419.3 conn/s
Connection time [ms]: avg 200.5 min 0.6 max 263.2 stddev 73.85
Connect time [ms]: avg 26.2 min 0.0 max 53.5 stddev 14.59
Request rate: 114192.6 req/s (0.0 ms/req)
Request size [B]: avg 129.0 min 129.0 max 129.0 stddev 0.00
Response rate: 114192.6 rsp/s (0.0 ms/rsp)
Response size [B]: avg 8.0 min 8.0 max 8.0 stddev 0.00
Response time [ms]: avg 17.4 min 0.0 max 28.8 stddev 0.01
Response time [ms]: p25 12.0 p50 20.0 p75 23.0
Response time [ms]: p95 28.0 p99 28.0 p999 29.0

Here are benchmarks using the memtier benchmark tool.
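
The invocation was presumably something along these lines (the server address, port, and protocol are my assumptions; the thread, connection, and request counts match the tables below):

memtier_benchmark --server=127.0.0.1 --port=11211 --protocol=memcache_binary \
  --threads=4 --clients=50 --requests=10000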

memtier_benchmark docker Memcached

4         Threads
50        Connections per thread
10000     Requests per thread
Type        Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
------------------------------------------------------------------------
Sets       16821.99          ---          ---      1.12600      2271.79
Gets      168035.07    159636.00      8399.07      1.12000     23884.00
Totals    184857.06    159636.00      8399.07      1.12100     26155.79

memtier_benchmark Centmin Mod Memcached

4         Threads
50        Connections per thread
10000     Requests per thread
Type        Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
------------------------------------------------------------------------
Sets       28468.13          ---          ---      0.62300      3844.59
Gets      284368.51    266547.14     17821.36      0.62200     39964.31
Totals    312836.64    266547.14     17821.36      0.62200     43808.90
p4guru
  • 2
    They compare two different builds of memcached, and also one of them is in Docker and the other outside of Docker, don't they? – san Sep 26 '15 at 21:31
  • 5
    Are these results with host networking or bridge networking in docker? – akaHuman Jan 11 '16 at 07:31
  • 27
    With such big stddevs these measurements do not show any representable data `avg 200.5 min 0.6 max 263.2 stddev 73.85` – Sergey Zhukov Sep 24 '16 at 17:37
1

Runtime Libraries Comparison

I'm going to approach the question of a container's runtime performance cost with respect to runtime libraries.

Speed: Musl vs. glibc

In Alpine Linux containers, the runtime libraries are provided by Musl in lieu of glibc, and according to the link below, there can be a performance difference between the two:

https://www.etalabs.net/compare_libcs.html

While researching this topic I've read various opinions that, being both tiny and significantly more modern, Musl also confers some degree of greater security over glibc. I haven't been able to locate any data to support these views, however.

Compatibility

Even if Musl is faster and more secure, it can present compatibility issues because Musl is materially different from glibc. I find, though, that if I'm creating a Docker image using apk to pull in my packages, there are no compatibility issues.

Conclusion

If performance matters, build two containers, one based on Alpine Linux with Musl and another on a distro that ships glibc, and benchmark them. And of course post your results in the comments!
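
As a starting point, here's a minimal sketch for checking which libc each base image actually uses (the image tags are examples, and it assumes ldd is present in both base images); whatever workload you benchmark inside each container is up to you:

# Alpine: binaries resolve against musl's dynamic loader
docker run --rm alpine ldd /bin/ls

# Debian: binaries resolve against glibc (libc.so.6)
docker run --rm debian ldd /bin/ls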

F1Linux