
I have read that there is a significant performance hit when mounting shared volumes on Windows. How does this compare to keeping, say, the Postgres DB inside a Docker volume (not shared with the host OS), or to the rate of reading/writing flat files?

Has anyone found any concrete numbers around this? I think even a 4x slowdown would be acceptable for my use case if it only affects disk I/O performance... I get the impression that mounted, shared volumes are significantly slower on Windows, so I want to know whether forgoing the sharing component helps bring things into an acceptable range.

Also, if I left Postgres on bare metal, could all of my Docker apps still access it? (That's probably preferred, I would imagine; I have seen reports of 4x faster reads/writes staying bare metal.) But I still need to know, because my apps also do a lot of copying, reading, and moving of flat files, so I need to know what is best for that.

For example, if shared volumes are really bad compared to keeping everything inside the container, then I have the option of pushing files over the network to avoid a shared mounted volume becoming the bottleneck...

Thanks for any insights

AustEcon

1 Answer


You only pay this performance cost for bind-mounted host directories. Named Docker volumes or the Docker container filesystem will be much faster. The standard Docker Hub database images are configured to always use a volume for storage, so you should use a named volume for this case.

docker volume create pgdata
docker run -v pgdata:/var/lib/postgresql/data -p 5432:5432 postgres:12

You can also run PostgreSQL directly on the host. On systems using the Docker Desktop application you can access it via the special hostname host.docker.internal. This is discussed at length in From inside of a Docker container, how do I connect to the localhost of the machine?.
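As a concrete sketch of that host-Postgres setup: a container can reach a database listening on the Windows host by using host.docker.internal as the hostname. The image name, database name, and credentials below are placeholders, not anything from your setup.

```shell
# Hypothetical example: a containerized app connecting to PostgreSQL
# running directly on the host, via Docker Desktop's special gateway
# hostname. The connection string details are placeholders.
docker run \
  -e DATABASE_URL="postgresql://app_user:app_pass@host.docker.internal:5432/app_db" \
  my-app-image
```

Note that on the host side, PostgreSQL must be configured to accept connections from the Docker bridge network (listen_addresses and pg_hba.conf), which the default localhost-only configuration does not.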

If you're using the Docker Desktop application, and you're using volumes for:

  • Opaque database storage, like the PostgreSQL data: use a named volume; it will be faster, and you couldn't usefully access the raw data directly even if it were on the host
  • Injecting individual config files: use a bind mount; these are usually read only once at startup, so there's not much of a performance cost
  • Exporting log files: use a bind mount; if there is enough log I/O to be a performance problem, you're probably actively debugging
  • Your application source code: don't use a volume at all; run the code that's in the image, or use a native host development environment
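The four cases above can be sketched in a single Compose file. This is a hypothetical docker-compose.yml for illustration only; the service names, file paths, and the assumption that your Postgres image reads a mounted config file at that path are all placeholders, not a tested setup.

```yaml
# Hypothetical docker-compose.yml illustrating the volume guidance above.
version: "3.8"
services:
  db:
    image: postgres:12
    volumes:
      # Named volume for opaque database storage: fast, Docker-managed.
      - pgdata:/var/lib/postgresql/data
      # Bind mount for an individual config file: read once at startup.
      - ./postgresql.conf:/etc/postgresql/postgresql.conf
  app:
    # No source-code volume: run the code baked into the image.
    build: .
    volumes:
      # Bind mount only for exported log files.
      - ./logs:/app/logs
volumes:
  pgdata:
```

The key point is that only the bind-mounted paths (./postgresql.conf, ./logs) cross the Windows filesystem boundary and pay the sharing cost; the pgdata named volume stays inside the Docker-managed storage.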
David Maze
    Thanks. Do you have any experience with the difference in network I/O between, say, 8 different microservices all running as Docker containers (specifically on Windows)? Will it be much slower than running the microservices on bare metal? I could maybe cope with a 50% speed drop, but a 90% loss would probably be unacceptable. I expect some increase in latency (maybe double?), but I'm talking about bulk throughput (MB/sec). It's hard to find reliable numbers on these things... – AustEcon Jun 22 '20 at 13:39
    Has anybody run any benchmarks? Running Postgres on the host versus in Docker gives me a 20% performance gain according to pgbench. – Brandon Ros Sep 04 '20 at 23:13