docker-compose and graceful Celery shutdown

Question

I've been searching for a solution to this and haven't found one.

I'm running Celery in a container built with docker-compose. My container is configured like this:

celery:
  build: .
  container_name: cl01
  env_file: ./config/variables.env
  entrypoint:
    - /celery-entrypoint.sh
  volumes:
    - ./django:/django
  depends_on:
    - web
    - db
    - redis
  stop_grace_period: 1m

And my entrypoint script looks like this:

#!/bin/sh
# Wait for django
sleep 10
su -m dockeruser -c "celery -A myapp worker -l INFO"

Now, if I run docker-compose stop, I would like a warm (graceful) shutdown that gives Celery the configured 1 minute (stop_grace_period) to finish tasks it has already started. However, docker-compose stop seems to kill Celery straight away. Celery should also log that it has been asked to shut down gracefully, but all I see is my task logs stopping abruptly.

What am I doing wrong or what do I need to change to make Celery shut down gracefully?

Edit: The answer below that suggests passing the --timeout parameter to docker-compose stop does not solve my issue.

score 7 · Answer 1 · answered Jun 26 '19 at 09:09

You need to start the celery process with exec. That way celery replaces the entrypoint shell and keeps its process ID (PID 1 in the container), so Docker's SIGTERM is delivered to celery directly and the worker can shut down gracefully.

# should be the last command in script
exec celery -A myapp worker -l INFO
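
To see what exec changes, here is a minimal sketch that needs neither Docker nor Celery. It only compares the PID reported by a shell with the PID reported by the command that shell launches, with and without exec. The variable names are made up for the illustration.

```shell
# With exec: the second shell replaces the first, so both report the same PID.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec: the second shell is a forked child with its own PID.
# The trailing ":" keeps the shell from implicitly exec'ing its last command.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; :')

# Unquoted expansion joins the two PIDs onto one line for display.
echo "with exec:   " $with_exec
echo "without exec:" $without_exec
```

With exec the two PIDs match. In a container this is what lets a worker started from the entrypoint script become the process that receives SIGTERM from Docker.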

score 1 · Answer 2 · answered Apr 07 '17 at 12:11

Via docs

Usage: stop [options] [SERVICE...]

Options:
-t, --timeout TIMEOUT      Specify a shutdown timeout in seconds (default: 10).

Try setting the timeout to at least 60 seconds.

Mateusz Knapczyk

Doesn't seem to work. Funnily enough, Celery, Celery Beat and Flower are the first things that are shut down within seconds. nginx, django, redis etc all take their time and close after 60 seconds.. And a question: I thought this was exactly what the `stop_grace_period` parameter in the docker-compose.yml was for? So you wouldn't need to set the timeout when running the bash command. – JanMensch Apr 07 '17 at 12:46
Well, yes, but you set it only for the celery service, whereas setting the timeout when running `docker-compose stop` should work for all containers. I don't know why it shuts down that way, though. – Mateusz Knapczyk Apr 07 '17 at 13:54

Taras Mykhalchuk · Answer 3 · 2022-12-01T13:41:23.687

Here is my experience implementing graceful shutdown for celery workers spawned by supervisord inside a docker container.

Supervisord part

supervisord.conf

...
[supervisord]
    ...
    nodaemon=true     ; run supervisord in the foreground
[include]
    files=celery.conf ; path to the celery config file

With nodaemon=true, supervisord stays in the foreground, so the entrypoint script can later start it as a background job and wait on its PID.

celery.conf

[group:celery_workers] 
    programs=one, two
[program:one]
    ...
    command=celery -A backend --config=celery.py worker -n worker_one --pidfile=/var/log/celery/worker_one.pid --pool=gevent --concurrency=10 --loglevel=INFO
    killasgroup=true
    stopasgroup=true
    stopsignal=TERM
    stopwaitsecs=600
[program:two]
    ...
    # similar to the previous one

The configuration file above starts a group of workers, each running in its own process. I'd like to dwell on the stopwaitsecs setting. Here is what the documentation says about it:

This parameter sets the number of seconds to wait for the OS to return a SIGCHLD to supervisord after the program has been sent a stopsignal. If this number of seconds elapses before supervisord receives a SIGCHLD from the process, supervisord will attempt to kill it with a final SIGKILL.

If stopwaitsecs is greater than the stop_grace_period specified for your service in the docker-compose file, Docker will send SIGKILL to the container before supervisord has finished waiting. Make sure stopwaitsecs < stop_grace_period; otherwise all running tasks are interrupted by Docker.
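
The comparison can be checked with a few lines of shell. This is only a sketch: duration_to_seconds is a hypothetical helper written for this example, it handles only the h, m and s units, and the two values are copied by hand from the celery.conf and docker-compose.yml shown in this answer.

```shell
# Hypothetical helper: convert a duration such as "15m30s" or "1m" to seconds.
# Only the h, m and s units are handled in this sketch.
duration_to_seconds() {
    total=0
    rest=$1
    while [ -n "$rest" ]; do
        num=${rest%%[!0-9]*}          # leading run of digits
        rest=${rest#"$num"}
        unit=${rest%"${rest#?}"}      # first character after the digits
        rest=${rest#?}
        if [ -z "$num" ]; then
            echo "cannot parse duration: $1" >&2
            return 1
        fi
        case $unit in
            h) total=$((total + num * 3600)) ;;
            m) total=$((total + num * 60)) ;;
            s) total=$((total + num)) ;;
            *) echo "cannot parse duration: $1" >&2; return 1 ;;
        esac
    done
    echo "$total"
}

stopwaitsecs=600                                # from celery.conf
grace_seconds=$(duration_to_seconds "15m30s")   # from docker-compose.yml

if [ "$stopwaitsecs" -lt "$grace_seconds" ]; then
    echo "ok: supervisord waits up to ${stopwaitsecs}s, docker up to ${grace_seconds}s"
else
    echo "warning: docker will send SIGKILL before supervisord stops waiting"
fi
```

With the values above, supervisord gives up after 10 minutes while Docker waits 15 minutes 30 seconds, so the workers get the full stopwaitsecs window.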

Entrypoint script part

entrypoint.sh

#!/bin/bash

# safety switch, exit script if there's error.
set -e

on_close(){
    echo "Signal caught..."
    echo "Supervisor is stopping processes gracefully..."

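    # stop the workers first, then clean up their pid files
    supervisorctl stop celery_workers:

    # -f keeps `set -e` from aborting the handler if a file is already gone
    # (adjust the paths to match the --pidfile values in celery.conf)
    rm -f worker_one.pid worker_two.pid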

    echo "All processes have been stopped. Exiting..."
    exit 1
}

start_supervisord(){
    supervisord -c /etc/supervisor/supervisord.conf
}

# start trapping signals (docker sends `SIGTERM` for shutdown);
# SIGKILL cannot be trapped, so it is not listed
trap on_close SIGINT SIGTERM

start_supervisord &   # start supervisord in the background
SUPERVISORD_PID=$!    # PID of the last background process started
wait $SUPERVISORD_PID
EXIT_STATUS=$?        # the exit status of the last command executed

The script above does three things:

it registers a cleanup function, on_close, as a signal handler
it starts supervisord (and with it the worker group) in the background
it stores the PID of that background process and waits for it to finish
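
The three steps above can be tried in isolation. The following is a sketch, not the script from this answer: sleep 30 stands in for supervisord, and a background subshell stands in for docker stop by sending SIGTERM to the script after one second. The names child_pid and stopped are assumptions made for the demo.

```shell
#!/bin/sh
stopped=no

on_close() {
    echo "Signal caught, stopping child $child_pid..."
    kill -TERM "$child_pid" 2>/dev/null
    wait "$child_pid" 2>/dev/null   # reap the child before continuing
    stopped=yes
}

# register the handler before anything long-running starts
trap on_close INT TERM

sleep 30 &              # stand-in for supervisord
child_pid=$!

# stand-in for `docker stop`: signal this script after one second
( sleep 1; kill -TERM $$ ) &

wait "$child_pid"       # returns as soon as the trapped signal arrives
trap - INT TERM         # restore default signal handling
echo "child stopped: $stopped"
```

The script finishes after about one second instead of thirty, because wait is interrupted by the trapped signal and the handler stops the child. In the real entrypoint the kill call is replaced by supervisorctl stop celery_workers:.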

Docker part

docker-compose.yml

...
services:
  celery:
    ...
    stop_grace_period: 15m30s
    entrypoint: [/entrypoints/entrypoint.sh]

The only setting worth mentioning here is the form of the entrypoint declaration. In our case it is better to use the exec form (a list). It starts the script as the process with PID 1 and doesn't wrap it in a shell subprocess, as the shell form does. The SIGTERM sent by docker stop <container> then reaches the script, which traps it and performs all the cleanup and shutdown logic.

score -4 · Answer 4 · edited May 23 '19 at 19:51

Try using this:

docker-compose down

edited by Zoe · answered May 23 '19 at 19:47 by Guest

Posts here are required to be in English. I "translated" (very roughly anyway) it for you this time because it was such a small part of the answer (and the rest appears to be useful) - but please keep that in mind next time you post an answer, or a question for that matter. – Zoe May 23 '19 at 19:51
Try to explain more clearly why this is an answer to the question. – Jeroen Heier May 23 '19 at 20:14
