How should I build a docker image for a development database?

Question

I have a development database which I would like to commit inside a docker image, to be pushed to a private repository and used in local development and CI builds.

The database is saved as a SQL backup, and I can successfully restore it to a MariaDB container by mapping the backup file into the official image's /docker-entrypoint-initdb.d/ directory. The image's entrypoint script executes the files in that directory the first time a container is run. When this happens, the database is restored by MariaDB into datafiles located in /var/lib/mysql inside the container.

Theoretically, I could then stop the container, commit it to a new image, push it, and I'm done. However, MariaDB's Dockerfile declares VOLUME /var/lib/mysql, which means that as soon as the container is created, this path is backed by an automatically created anonymous volume on the host, even if --volume is not used to explicitly mount at that path. Because the data lives in that volume rather than in the container's writable layer, the restored database will not be captured by docker commit. It does not appear possible to override this automatic volume mount.

One workaround is to docker create a new container without running it, docker cp my database backup into /docker-entrypoint-initdb.d/, and then docker commit this to a new image. This image can be pushed since it contains the backup data, and when it is first run, MariaDB will restore it into the database. However, you have to wait for the restore to complete whenever the container is created, instead of the image shipping with the database already restored.
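
For reference, the create/cp/commit workaround described above can be sketched as a small shell function. This is only a sketch under stated assumptions: bake_backup_into_image is a hypothetical helper name, and the image names and backup path are whatever you pass in. It relies only on the docker create, docker cp, docker commit, and docker rm subcommands.

```shell
# Hypothetical helper: bake a SQL backup into a new image without starting MariaDB.
# Usage: bake_backup_into_image BASE_IMAGE BACKUP_FILE NEW_IMAGE
bake_backup_into_image() {
  local base_image="$1" backup_file="$2" new_image="$3"
  local cid status

  # Create (but do not start) a container, so the entrypoint never runs.
  cid="$(docker create "$base_image")" || return 1

  # /docker-entrypoint-initdb.d/ is not a declared volume, so a file copied
  # there ends up in the container's writable layer and survives the commit.
  docker cp "$backup_file" "$cid:/docker-entrypoint-initdb.d/" &&
    docker commit "$cid" "$new_image"
  status=$?

  # Remove the scratch container together with its anonymous volume.
  docker rm -v "$cid" >/dev/null
  return "$status"
}
```

Calling it as `bake_backup_into_image mariadb ./database.sql devdb:with-backup` would produce an image that still has to run the restore on first start, which is the drawback noted above.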

Another workaround is to fork the MariaDB Dockerfile to remove the VOLUME directive, or create my own by installing MariaDB myself on a base image.

A third option might be to create my own Dockerfile to build from mariadb as a base image, and attempt to do the restore as part of that build process, but I haven't yet fully investigated this.

What is the most effective approach to accomplish this objective?

`VOLUME` in Dockerfile looking more and more like an antipattern: https://boxboat.com/2017/01/23/volumes-and-dockerfiles-dont-mix/ — NReilingh, Oct 08 '21 at 16:59
This might be the way: https://stackoverflow.com/questions/42316614/how-can-i-edit-an-existing-docker-image-metadata — NReilingh, Oct 08 '21 at 17:05

score 3 · Answer 1 · answered Oct 09 '21 at 13:40

I've found my way to the following approach:

Volume definitions are stored in image metadata, so it is possible to manually alter this metadata in order to remove VOLUME declarations and save the result to a new image. This process has been encapsulated in a helpful Python package called docker-copyedit.

$ docker pull mariadb
$ pip install docker-copyedit
$ docker-copyedit.py FROM mariadb INTO mariadb:novolumes REMOVE ALL VOLUMES
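
To confirm that the declaration is really gone, the volume metadata of an image can be printed with docker image inspect. The wrapper below is a minimal sketch; image_volumes is a hypothetical name, and the exact output for an image without volumes (I would expect null or an empty object) is an assumption.

```shell
# Hypothetical helper: print the VOLUME declarations stored in an image's metadata.
# Usage: image_volumes IMAGE
image_volumes() {
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}
```

Running `image_volumes mariadb` should list /var/lib/mysql, while `image_volumes mariadb:novolumes` should not.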

Now that I have a local :novolumes image of MariaDB, I can automate the database restore inside a Dockerfile build. In order to use the image's built-in functionality for creating a new database and restoring from the /docker-entrypoint-initdb.d/ directory, I created a bash script which sources docker-entrypoint.sh and then executes the relevant portions of _main() to do the restore and then exit. It does not call _main() directly, because that would end by starting the database server in the foreground, so the build would never exit. Then I use a second build stage to import just the /var/lib/mysql directory from the restore stage.
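
Sourcing the entrypoint is safe because, as far as I can tell, the script only calls _main() when it is executed directly and not when it is sourced. The following self-contained sketch demonstrates that kind of guard. The file it writes is a hypothetical stand-in, not the real docker-entrypoint.sh, and the BASH_SOURCE comparison is a simplified version of whatever check the real script uses.

```shell
# Write a tiny stand-in "entrypoint" to a temp file (hypothetical, for illustration only).
demo_entrypoint="$(mktemp)"
cat > "$demo_entrypoint" <<'EOF'
helper() { echo "helper called"; }
_main() { echo "main called"; }
# Run _main only when this file is executed directly, not when it is sourced.
if [ "${BASH_SOURCE[0]}" = "$0" ]; then
  _main "$@"
fi
EOF

# Sourcing defines the functions but skips _main, so only helper produces output.
sourced_output="$(bash -c 'source "$1"; helper' _ "$demo_entrypoint")"

# Executing the file directly triggers _main.
executed_output="$(bash "$demo_entrypoint")"

echo "sourced:  $sourced_output"
echo "executed: $executed_output"
rm -f "$demo_entrypoint"
```

This is why restore_db.sh can pick up the helper functions and call them one by one.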

Dockerfile

FROM mariadb:novolumes AS restorer

ENV MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=yes

COPY data/source/database.sql /docker-entrypoint-initdb.d/

COPY restore_db.sh restore_db.sh
RUN ["./restore_db.sh", "mysqld"]

FROM mariadb:novolumes

COPY --from=restorer /var/lib/mysql /var/lib/mysql

restore_db.sh

#!/bin/bash

# Import helper functions
source docker-entrypoint.sh

# Direct copy-paste of lines 356-392 from docker-entrypoint.sh:

mysql_note "Entrypoint script for MariaDB Server ${MARIADB_VERSION} started."

mysql_check_config "$@"
# Load various environment variables
docker_setup_env "$@"
docker_create_db_directories

# If container is started as root user, restart as dedicated mysql user
if [ "$(id -u)" = "0" ]; then
  mysql_note "Switching to dedicated user 'mysql'"
  exec gosu mysql "$BASH_SOURCE" "$@"
fi

# there's no database, so it needs to be initialized
if [ -z "$DATABASE_ALREADY_EXISTS" ]; then
  docker_verify_minimum_env

  # check dir permissions to reduce likelihood of half-initialized database
  ls /docker-entrypoint-initdb.d/ > /dev/null

  docker_init_database_dir "$@"

  mysql_note "Starting temporary server"
  docker_temp_server_start "$@"
  mysql_note "Temporary server started."

  docker_setup_db
  docker_process_init_files /docker-entrypoint-initdb.d/*

  mysql_note "Stopping temporary server"
  docker_temp_server_stop
  mysql_note "Temporary server stopped"

  echo
  mysql_note "MariaDB init process done. Ready for start up."
  echo
fi

Finally, I can do the build and run the container to quickly create a local development database, and/or push the image to a repository.

$ docker build -t nreilingh/devdb .
$ docker run --rm -d nreilingh/devdb
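
In CI it can help to wait until the server accepts connections before running anything against it. The function below is a rough sketch of such a wait loop; wait_for_db is a hypothetical name, and the default admin client name mysqladmin is an assumption that depends on the image version (newer MariaDB images may only ship mariadb-admin, which can be passed as the third argument).

```shell
# Hypothetical helper: poll a running container until the server answers a ping.
# Usage: wait_for_db CONTAINER [TRIES] [ADMIN_CMD]
wait_for_db() {
  local container="$1" tries="${2:-30}" admin_cmd="${3:-mysqladmin}"
  local i=0

  while [ "$i" -lt "$tries" ]; do
    # "ping" exits 0 once the server is up and reachable inside the container.
    if docker exec "$container" "$admin_cmd" ping --silent >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done

  # Give up after the requested number of attempts.
  return 1
}
```

With the container ID printed by docker run -d stored in a variable such as cid, `wait_for_db "$cid"` returns once the database is reachable. Because the data is already in the image, this should take seconds rather than the full restore time.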
