I use docker and docker compose to package scientific tools into easily/universally executable modules. One example is a docker that packages a rather complicated python library into a container that runs a jupyter notebook server; the idea is that other scientists who are not terribly tech-savvy can clone a github repository, run docker-compose up
then do their analyses without having to install the library, configure various plugins and other dependencies, etc.
I have this all working fine except that I'm having issues getting the volume mounts to work in a coherent fashion. The reason for this is that the library inside the docker container handles multiple kinds of datasets, which users will store in several separate directories that are conventionally tracked through shell environment variables. (Please don't tell me this is a bad way to do this--it's the way things are done in the field, not the way I've chosen to do things.) So, for example, if the user stores FreeSurfer data, they will have an environment variable named SUBJECTS_DIR that points to the directory containing the data; if they store HCP data, they will have an environment variable HCP_SUBJECTS_DIR. However, they may have both, either, or neither of these set (as well as a few others).
I would like to be able to put something like this in my docker-compose.yml file in order to handle these cases:
version: '3'
services:
my_fancy_library:
build: .
ports:
- "8080:8888"
environment:
- HCP_SUBJECTS_DIR="/hcp_subjects"
- SUBJECTS_DIR="/freesurfer_subjects"
volumes:
- "$SUBJECTS_DIR:/freesurfer_subjects"
- "$HCP_SUBJECTS_DIR:/hcp_subjects"
In testing this, if the user has both environment variables set, everything works swimmingly. However, if they don't have one of these set, I get an error about not mounting directories that are fewer than 2 characters long (which I interpret to be a complaint about mounting a volume specified by ":/hcp_subjects").
This question asks basically the same thing, and the answer points to here, which, if I'm understanding it right, basically explains how to have multiple docker-compose files that are resolved in some fashion. This isn't really a viable solution for my case for a few reasons:
- This tool is designed for use by people who don't necessarily know anything about docker, docker-compose, or related utilities, so expecting them to write/edit their own docker-compose.yml file is a problem
- There are more than just two of these directories (I have shown two as an example) and I can't realistically make a docker-compose file for every possible combination of these paths being declared or not declared
- Honestly, this solution seems really clunky given that the information needed is right there in the variables that docker-compose is already reading.
The only decent solution I've been able to come up with is to ask the users to run a script ./run.sh
instead of docker-compose up
; the script examines the environment variables, writes out its own docker-compose.yml
file with the appropriate volumes, and runs docker-compose up
itself. This also seems somewhat clunky, but it works.
Does anyone know of a way to conditionally mount a set of volumes based on the state of the environment variables when docker-compose up
is run?