
I am trying to create a very simple Docker application that consists of a persistent SQL database and an R script that adds data to the database at regular intervals (but can also be run more often, if needed). I am new to multi-container applications, but running things on demand is simple enough: I have an SQL database container and an R-based script container that connects to it. But what is the best way to schedule calls to the script?

I could create a cron job inside the container housing the script and run it repeatedly that way, but this feels like it violates the "one process per container" principle, and it wouldn't scale easily if I made things more complex. I could also run cron on my host, but that feels wrong. Other websites suggest creating a separate, persistent container just to coordinate cron jobs.

But if the job I want to run is itself in a dockerized container, what's the best way to accomplish this? Can a cron container issue a `docker run` command against the "sleeping" script container? If possible, I assume it's best to have containers running only when you actually need them.

Lastly, could all of this be written into a docker-compose file?

I've successfully used a cron scheduler INSIDE the container housing the R script and OUTSIDE of all the containers as part of my host's crontab, but my research suggests these are bad ways to do it.

Andrew
  • [How to run a cron job inside a docker container?](https://stackoverflow.com/questions/37458287/how-to-run-a-cron-job-inside-a-docker-container) includes several sample setups; does one of them meet your needs? – David Maze Jan 18 '23 at 23:41
  • Not really? I guess ideally I imagined that the best way to do this would include having a `cron` container trigger containerized tasks that don't include an enduring main process. But from what I'm seeing on these posts, most people have the `cron` container also run all the scripts on the same container? That seems to violate the "one-process" principle though, so I'm confused. – Andrew Jan 19 '23 at 16:13
  • The conflict seems to be [referenced here](https://stackoverflow.com/questions/39468841/is-it-possible-to-start-a-stopped-container-from-another-container). On one hand, I read that Docker containers should be simple and exit after they run. On the other hand, they say, "... Docker is really meant for long running services ... So it's best to build your system so that containers are up and stay up." I guess this is the tension that currently has me confused! – Andrew Jan 19 '23 at 16:21

1 Answer


I also think that scheduled jobs should run in their own containers if possible.

You can make a super simple container scheduler using docker-in-docker and cron like this:

Dockerfile

```dockerfile
FROM docker:dind
COPY crontab .
CMD crontab crontab && crond -f
```

example crontab file

```
* * * * * docker run -d hello-world
```

build and run with

```shell
docker build -t scheduler .
docker run -d -v /var/run/docker.sock:/var/run/docker.sock scheduler
```

It'll then run the hello-world image once every minute.
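As for the docker-compose part of the question: the same setup can be expressed as a compose file. A minimal sketch, assuming the Dockerfile and crontab above live in a `./scheduler` directory and the database is Postgres (the service names, paths, and image are placeholders):

```yaml
# docker-compose.yml -- service names, paths, and the postgres image are assumptions
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
    volumes:
      - dbdata:/var/lib/postgresql/data   # persistent database storage

  scheduler:
    build: ./scheduler                    # the docker:dind Dockerfile + crontab shown above
    volumes:
      # mount the host's Docker socket so scheduled `docker run` calls
      # start containers on the host, alongside the db container
      - /var/run/docker.sock:/var/run/docker.sock

volumes:
  dbdata:
```

The crontab entry would then run your R script image instead of hello-world. Since each scheduled container is started via the host's daemon, it isn't itself part of the compose file and simply exits when the script finishes.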

Hans Kilian
  • I haven't even heard of Docker-in-Docker before! Is there any reason *not* to use it? Like, does it require more resources, add unneeded complexity, etc? The recursive nature of `dind` worries me for some hard-to-explain reason... – Andrew Jan 19 '23 at 16:09
  • It's not recursive. Docker-in-docker just means that the image has the docker CLI installed. The mapping of the docker socket gives it access to the docker daemon on the host. So there's only one docker daemon and when you schedule a container this way, the containers are run on the host. – Hans Kilian Jan 19 '23 at 16:35