51

While issuing a new build to update code in workers, how do I restart Celery workers gracefully?

Edit: What I intend to do is something like this (see the sketch after the list).

  • Worker is running, probably uploading a 100 MB file to S3
  • A new build comes
  • Worker code has changes
  • Build script fires a signal to the worker(s)
  • New workers are started with the new code
  • Worker(s) that got the signal exit after finishing their current job
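
In signal terms, roughly what I'm after (a sketch only; the pidfile path and the app name proj are made up, and TERM is Celery's documented warm-shutdown signal):

    # Warm shutdown: each worker finishes its current task, then exits
    kill -TERM $(cat /var/run/celery/worker.pid)
    # Start replacement workers from the freshly deployed code
    celery worker -A proj --detach --pidfile=/var/run/celery/worker-new.pid -l info
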
Quintin Par

9 Answers

51

According to https://docs.celeryq.dev/en/stable/userguide/workers.html#restarting-the-worker you can restart a worker by sending a HUP signal

 ps auxww | grep celeryd | grep -v "grep" | awk '{print $2}' | xargs kill -HUP
armonge
    `sudo ps auxww | grep celeryd | grep -v "grep" | awk '{print $2}' | sudo xargs kill -HUP` exclude grep :-) – Quintin Par Mar 24 '12 at 07:59
  • 5
    You can replace grep celeryd | grep -v "grep" with grep [c]eleryd. Just saying. – chanux Oct 11 '13 at 07:42
  • 4
    It seems that it is not a graceful restart, is it? As the docs say: "Other than stopping then starting the worker to restart, you can also restart the worker using the HUP signal, but note that the worker will be responsible for restarting itself so this is prone to problems and is not recommended in production" So what is the best way to reload Celery in production to avoid failures? – mennanov Oct 08 '14 at 11:57
  • 2
    For celery multi: "For production deployments you should be using init scripts or other process supervision systems". As for HUP: "this is prone to problems and **is not recommended in production**" – webjunkie Aug 02 '16 at 10:12
  • 1
    The celery documentation appears to contradict itself on this subject; here it says don't use `celery multi` in production, but in the daemonization section the example systemd config file uses `celery multi`. – GDorn Jan 11 '21 at 17:27
15
celery multi start 1 -A proj -l info -c4 --pidfile=/var/run/celery/%n.pid
celery multi restart 1 --pidfile=/var/run/celery/%n.pid

http://docs.celeryproject.org/en/latest/userguide/workers.html#restarting-the-worker
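
As the comments note, the daemonization guide wraps these same celery multi commands in a systemd unit for production use. A simplified sketch of that pattern (the binary path, node name w1, app name proj, and pidfile location are all placeholders; the docs' real example uses an EnvironmentFile):

    [Unit]
    Description=Celery worker
    After=network.target

    [Service]
    Type=forking
    # celery multi forks and exits, hence Type=forking
    ExecStart=/usr/local/bin/celery multi start w1 -A proj --pidfile=/var/run/celery/w1.pid
    ExecStop=/usr/local/bin/celery multi stopwait w1 --pidfile=/var/run/celery/w1.pid
    ExecReload=/usr/local/bin/celery multi restart w1 --pidfile=/var/run/celery/w1.pid

    [Install]
    WantedBy=multi-user.target

A deploy can then run sudo systemctl reload celery, which triggers the ExecReload line under a proper supervisor.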

zengr
  • 5
    Uugh... it says right there "The easiest way to manage workers for *development* is by using celery multi. For **production deployments** you should be using *init scripts or other process supervision systems*". This answer does not apply to running in production! – webjunkie Aug 02 '16 at 10:09
  • 1
    @webjunkie The OP didn't say "in production deployment", so I'm not sure why you would downvote it when that was not mentioned in the original question. – zengr Aug 02 '16 at 17:22
  • 3
    He also did not say he wants a solution for, e.g., a testing environment. A lot of people will not bother to read further and will dangerously go and use a solution that appears right to them, so it is only fair to mention the drawbacks rather than copy and paste something from the documentation while ignoring its notes and stripping away its recommendations. – webjunkie Aug 03 '16 at 10:34
8

You can do:

celery multi restart w1 -A your_project -l info  # restart workers


Zulu
cảnh nguyễn
6

If you're going the kill route, pgrep to the rescue:

kill -9 `pgrep -f celeryd`

Mind you, my tasks are not long-running and I don't care if they terminate brutally; I'm just reloading new code during dev. I'd go the restart-service route if it were more sensitive.
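
The two-step version (inspect first, then kill), using pkill as later suggested in the comments:

    pgrep -f celeryd     # step 1: list the PIDs that would be hit
    pkill -9 -f celeryd  # step 2: kill them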

JL Peyret
  • 10,917
  • 2
  • 54
  • 73
  • 4
    (pkill does this in a cleaner way) – orzel Jun 27 '16 at 20:27
  • Didn't know that. I still prefer seeing the list of processes that will be killed beforehand, however: step 1, tune your pgrep; step 2, weaponize it by feeding it to kill. – JL Peyret May 23 '18 at 04:34
2

What should happen to long-running tasks? I like it this way: long-running tasks should finish their job. Don't interrupt them; only new tasks should get the new code.

But this is not possible at the moment: https://groups.google.com/d/msg/celery-users/uTalKMszT2Q/-MHleIY7WaIJ
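
Until then, one workaround is to drain the old workers while new ones take over (a sketch; the node names, app name proj, and pidfile path are made up):

    # Start workers running the new code alongside the old ones
    celery multi start new1 -A proj --pidfile=/var/run/celery/%n.pid
    # TERM the old workers and wait: they stop taking new tasks,
    # finish what they're running, then exit
    celery multi stopwait old1 --pidfile=/var/run/celery/%n.pid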

guettli
  • 25,042
  • 81
  • 346
  • 663
2

I have repeatedly tested the -HUP solution using an automated script, but find that about 5% of the time, the worker stops picking up new jobs after being restarted.

A more reliable solution is:

stop <celery_service>
start <celery_service>

which I have used hundreds of times now without any issues.

From within Python, you can run:

import subprocess

service_name = 'celery_service'
for command in ['stop', 'start']:
    # Runs "stop celery_service", then "start celery_service"
    subprocess.check_call([command, service_name])
crunk_monad
2

You should look at Celery's autoreloading. Note that this was an experimental feature and was removed in Celery 4.0, so it only applies to Celery 3.x.
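
In Celery 3.x it was a worker flag, e.g. (the app name proj is a placeholder):

    celery worker -A proj -l info --autoreload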

sebasmagri
0

If you're using docker/docker-compose and running Celery in a separate container from the Django container, you can use

docker-compose kill -s HUP celery

where celery is the service name. The worker will be gracefully restarted and the ongoing task will not be brutally stopped.

I tried pkill, kill, celery multi stop, celery multi restart, and docker-compose restart; none of them worked. Either the container stopped abruptly or the code was not reloaded.

I just want to reload my code on the prod server manually with a one-liner, without playing with daemonization.

Benny Chan
-3

Might be late to the party. I use:

sudo systemctl stop celery

sudo systemctl start celery

sudo systemctl status celery
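
systemctl restart does the stop/start pair in one step:

    sudo systemctl restart celery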

Denis Kanygin