
docker data volume vs mounted host directory

says volumes should be preferred over bind mounts

I have a few questions regarding the issue. The post says:

When you create a volume, it is stored within a directory on the Docker host

Bear with me, I'm new to Docker, and I'm wondering what the Docker host is here.

Is it a machine where I build the image (probably not)?
Is it the machine where the image will be run? If so, what happens if I run the image on multiple machines? Will it create two independent volumes?
When I have development and production setups, how does Docker manage two separate volumes for each environment?

Besides, it seems fairly easy to lose data by running docker-compose down when I use data volumes. That's the first obstacle that makes me hesitate to use them. Is there an obvious way to mitigate the issue?

eugene

2 Answers


Avoiding bind mounts is not a hard rule. Yes, they can damage your host's file system if mounted carelessly (like -v /bin:/var/log), since you have root privileges inside the container by default, and they are less portable. But they make it easy to exchange files between host and container. When you want to provide initial configuration for your service, or put source code into a container for compilation, I believe you would prefer a bind mount over creating and running a temporary container just to copy files into a volume with docker cp. Also, always use the :ro (read-only) option when possible to prevent data from being modified from inside the container.
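
To illustrate the :ro advice, here is a minimal sketch. The temporary directory, the /etc/myapp path and the alpine image are assumptions made up for the example, and the Docker commands are skipped if no daemon is reachable.

```shell
# Hypothetical config directory on the host (a temp dir, so nothing real is touched)
conf_dir="$(mktemp -d)"
echo "greeting=hello" > "$conf_dir/app.conf"

# Only talk to Docker if a daemon is actually reachable
if docker info >/dev/null 2>&1; then
  # Read-only bind mount: the container can read the host file...
  docker run --rm -v "$conf_dir:/etc/myapp:ro" alpine cat /etc/myapp/app.conf

  # ...but an attempt to write to it from inside the container is rejected
  docker run --rm -v "$conf_dir:/etc/myapp:ro" alpine \
    sh -c 'echo changed > /etc/myapp/app.conf' \
    || echo "write rejected because the mount is read-only"
fi
```

Without :ro, the second command would overwrite the file on the host, which is exactly the risk described above.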

Docker host: the machine (PC) where the Docker daemon is running.

Is it a machine where I build the image (probably not)?

Not necessarily. The image is built by whichever Docker daemon your client talks to, and you can point the Docker CLI or the Docker API at a remote daemon.
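
A rough sketch of what a remote build can look like. REMOTE_DOCKER, the example address and the myapp:dev tag are hypothetical, and the sketch assumes a Dockerfile in the current directory and SSH access to a machine running a Docker daemon.

```shell
# Hypothetical address; set REMOTE_DOCKER to a daemon you can actually reach, e.g.
#   export REMOTE_DOCKER=ssh://user@build-box.example.com
REMOTE_DOCKER="${REMOTE_DOCKER:-}"

if [ -n "$REMOTE_DOCKER" ]; then
  # Print the name of the machine whose daemon answers: that machine is the Docker host
  docker -H "$REMOTE_DOCKER" info --format '{{.Name}}'

  # The build context (.) is sent from this machine; the resulting image
  # is stored on the remote host, not locally
  docker -H "$REMOTE_DOCKER" build -t myapp:dev .
else
  echo "REMOTE_DOCKER is not set, skipping the remote build"
fi
```

Any volume created through that connection would likewise live on the remote machine, because that is where the daemon runs.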

Is it the machine where the image will be run?

Yes. Containers are started from images by the Docker daemon, so the machine running that daemon is the host.

If so, what happens if I run the image on multiple machines? Will it create two independent volumes?

It depends. Running images on different machines can be achieved in different ways, starting with orchestrators like Kubernetes or Docker Swarm and ending with a manual launch on separate Docker daemons. With manual launches and the default local driver, each daemon creates its own independent volume. With orchestrators it is possible to have the same volume shared among different hosts, but in that case you can't use bind mounts; you have to use volumes.

When I have development and production setups, how does Docker manage two separate volumes for each environment?

Docker doesn't; you are the one who manages them.
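
One possible way to manage this yourself, if both environments ever share a Docker host, is to give each environment its own volume name. This is only a sketch; the myapp prefix and the naming scheme are assumptions, not a Docker convention. On separate hosts the volumes are independent anyway, since each daemon stores its own.

```shell
app="myapp"   # hypothetical application name

for env in dev prod; do
  vol="${app}_${env}_data"
  echo "volume for $env: $vol"

  # Create the volume only if a Docker daemon is reachable
  if docker info >/dev/null 2>&1; then
    docker volume create "$vol" >/dev/null
  fi
done

# List the volumes whose names contain the application prefix
if docker info >/dev/null 2>&1; then
  docker volume ls --filter "name=${app}_"
fi
```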

Besides, it seems fairly easy to lose data by running docker-compose down when I use data volumes. That's the first obstacle that makes me hesitate to use them. Is there an obvious way to mitigate the issue?

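Volumes can easily persist between docker-compose sessions: docker-compose down removes named volumes only when you pass -v (--volumes), and it never removes volumes declared as external. The most explicit way to achieve that is to declare the volume in advance with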

docker volume create foo

and then use it in your compose files:

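version: '3'
services:
  abc:
    image: alpine   # placeholder image (assumption); use your own
    volumes:
      - foo:/foo    # volumes is a list, so each entry needs a leading dash
volumes:
  foo:
    external: true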
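
To check that the data really survives, something like the following sketch could be used. The alpine image, the log.txt file and the temporary project directory are assumptions for the example, and the Docker part is skipped if no daemon or no docker-compose binary is available (newer installs use docker compose instead).

```shell
# Write a throwaway compose project into a temp directory
workdir="$(mktemp -d)"
cat > "$workdir/docker-compose.yml" <<'EOF'
version: '3'
services:
  abc:
    image: alpine
    command: sh -c "date >> /foo/log.txt"
    volumes:
      - foo:/foo
volumes:
  foo:
    external: true
EOF

if docker info >/dev/null 2>&1 && command -v docker-compose >/dev/null 2>&1; then
  docker volume create foo

  # Run the service once (it appends a timestamp to the volume and exits),
  # then tear everything down, even asking for volumes to be removed
  docker-compose -f "$workdir/docker-compose.yml" up
  docker-compose -f "$workdir/docker-compose.yml" down -v

  # The external volume and its contents are still there
  docker run --rm -v foo:/foo alpine cat /foo/log.txt
fi
```

Removing the volume then takes an explicit docker volume rm foo, which is hard to do by accident.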
grapes
| Feature | Bind mount | Volume |
| --- | --- | --- |
| How it works | Attaches a user-specified location on the host filesystem to a specific point in the container file tree. | Attaches disk storage on the host filesystem or cloud storage. |
| Command | --mount type=bind,src="",dst="" | Docker CLI: docker volume command |
| Dependency | Depends on a location on the host filesystem. | Container-independent data management. |
| Separation of concerns | No | Yes |
| Conflict with other containers | Yes. Example: multiple instances of Cassandra that all use the same host location as a bind mount for data storage. In that case, each of the instances would compete for the same set of files. Without other tools such as file locks, that would likely result in corruption of the database. | No. By default, Docker creates volumes by using the local volume plugin. |
| When to choose | (1) When the host provides a file or directory that is needed by a program running in a container, or when that containerized program produces a file or log that is processed by users or programs running outside containers. (2) On workstations or machines with specialized concerns. (3) On systems with more traditional configuration management tooling. | Working with persistent storage: (1) databases, (2) cloud storage. |
| When not to choose | Better to avoid these kinds of specific bindings in generalized platforms or hardware pools. | To be written |
Gupta