In my Django project, I use Celery and RabbitMQ to run tasks in the background. I am using the celery beat scheduler to run periodic tasks. How can I check if celery beat is up and running, programmatically?
- Why do you want to do that? Do you want to check before sending a task? – Timmy O'Mahony Feb 12 '16 at 10:35
- One reason would be to start failover if celerybeat were not running. There doesn't appear to be an accepted way to accomplish that. – Jonathan Richards Jul 31 '16 at 10:07
- If so, then it's better to use `supervisord` or something like it. – trinchet Nov 28 '16 at 15:18
- It's a genuine problem. We recently had an outage in one of our modules, which relies on celery beat to run important crons. It turned out that last Friday there was a temporary network outage on a few servers, one of which hosted our RabbitMQ installation. It was recovered within an hour, but celery beat, as it can't do any reconnection, left the whole system practically `down` till Monday morning. I am also looking for a reliable way to monitor it now. I found this, but Python 2.7 is not supported in the latest version: https://github.com/KristianOellegaard/django-health-check#supported-versions – Tushar Sep 12 '18 at 19:40
8 Answers
Make a task that sends an HTTP request to a ping URL at regular intervals. When the URL is not pinged on time, the URL monitor will send you an alert.
import requests
from yourapp.celery_config import app

@app.task
def ping():
    print('[healthcheck] pinging alive status...')
    # healthchecks.io works for me:
    requests.post("https://hchk.io/6466681c-7708-4423-adf0-XXXXXXXXX")
This celery periodic task is scheduled to run every minute; if it doesn't hit the ping, your beat service is down* and the monitor will kick off an email to you (or a webhook, so you can wire it to Zapier and get mobile push notifications as well).
celery -A yourapp.celery_config beat -S djcelery.schedulers.DatabaseScheduler
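For reference, a minimal sketch of such an every-minute schedule if you keep it in code rather than in djcelery's DatabaseScheduler shown above (names mirror the answer's yourapp.celery_config module and are otherwise assumptions):

# Hypothetical in-code schedule entry for the ping task above (Celery 4+ style);
# with the DatabaseScheduler you would create the equivalent entry in the database.
app.conf.beat_schedule = {
    "healthcheck-ping-every-minute": {
        "task": "yourapp.celery_config.ping",
        "schedule": 60.0,  # seconds
    },
}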
*or overwhelmed. You should also track task saturation; this is a nightmare with Celery and should be detected and addressed properly. It happens frequently when the workers are busy with blocking tasks that would need optimization.

- If you mean "hchk.io" could be down, that's true; then rely on an AWS-like lambda function, for example. If you mean the celery "worker" could be down, that's not a problem either: if the worker is dead it never pings, so the service stops hearing "I'm alive" and reports it as dead. That's the point of this kind of service. – panchicore May 13 '19 at 18:35
If you have daemonized celery following the tutorial in the celery docs, you can check whether it's running or not through:
sudo /etc/init.d/celeryd status
sudo /etc/init.d/celerybeat status
You can use the return code of such commands from a Python module.
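A minimal sketch of that, assuming the init scripts from the Celery daemonization docs are in place (script names and the use of sudo may differ on your system):

import subprocess

def celerybeat_is_running():
    # init.d status scripts conventionally exit 0 when the service is up
    result = subprocess.run(
        ["sudo", "/etc/init.d/celerybeat", "status"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0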

Do you use upstart or supervisor or something else to run celery workers + celery beat as background tasks? In production you should use one of them to run celery workers + celery beat in the background.
The simplest way to check that celery beat is running: ps aux | grep -i '[c]elerybeat'. If you get a line of text containing a pid, it's running. You can also make the output of this command prettier: ps aux | grep -i '[c]elerybeat' | awk '{print $2}'. If you get a number, it's working; if you get nothing, it's not.
You can also check the celery workers' status: celery -A projectname status.
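Since the question asks for a programmatic check, the same thing can be done from Python; this is just a sketch wrapping the commands above (the process-name pattern is an assumption, adjust it to how your beat process is named):

import subprocess

def celerybeat_pid():
    # '[c]elerybeat' keeps the grep process itself out of the results
    output = subprocess.run(
        "ps aux | grep -i '[c]elerybeat' | awk '{print $2}'",
        shell=True, capture_output=True, text=True,
    ).stdout.strip()
    return int(output.splitlines()[0]) if output else None  # pid, or None if not running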
If you are interested in advanced celery monitoring, you can read the monitoring guide in the official documentation.

You can probably look up supervisor.
It provides a celerybeat conf which logs everything related to beat in /var/log/celery/beat.log.
Another way of going about this is to use Flower. You can set it up for your server (make sure it's password protected), and it becomes somewhat easier to notice in the GUI which tasks are being queued and at what time they are queued, thus verifying that your beat is running fine.

I have recently used a solution similar to what @panchicore suggested, for the same problem.
The problem at my workplace was an important system relying on celery beat: once in a while, either due to a RabbitMQ outage or some connectivity issue between our servers and the RabbitMQ server, celery beat simply stopped triggering crons until it was restarted.
As we didn't have any tool handy to monitor keep-alive calls sent over HTTP, we used statsd for the same purpose. A counter is incremented on the statsd server every minute (done by a celery task), and then we set up email & Slack channel alerts on the Grafana metrics (no updates for 10 minutes == outage).
I understand it's not a purely programmatic approach, but any production-level monitoring/alerting isn't complete without a separate monitoring entity.
The programming part is as simple as it can be. A tiny celery task running every minute.
from datetime import timedelta
from celery.task import periodic_task  # Celery < 5; logger, statsd and statsd_tags come from the project

@periodic_task(run_every=timedelta(minutes=1))
def update_keep_alive():
    logger.info("running keep alive task")
    statsd.incr(statsd_tags.CELERY_BEAT_ALIVE)
A problem that I have faced with this approach is statsd packet loss over UDP. So use a TCP connection to statsd for this purpose, if possible.
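If you happen to use the `statsd` PyPI client, it ships a TCP variant that can be swapped in; a sketch (host, port, prefix and metric name here are placeholders, not from the original setup):

import statsd

# TCPStatsClient exposes the same .incr()/.timing() API as the default UDP StatsClient
statsd_client = statsd.TCPStatsClient(host="localhost", port=8125, prefix="myapp")
statsd_client.incr("celery.beat.alive")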

You can check whether the scheduler is running or not with the following command:
python manage.py celery worker --beat

- @MohammedShareefC did you install `celery`? And did you add `celery` to installed apps? – anjaneyulubatta505 Mar 12 '18 at 11:29
- @MohammedShareefC you can check the list of available commands by running `python manage.py` – anjaneyulubatta505 Mar 12 '18 at 11:29
- I didn't have the djcelery app installed, which turned out to be the issue. But when I installed it, it started giving a `cannot import models` error pointing at a line in my project code. This happens only when I use celery. – Mohammed Shareef C Mar 12 '18 at 14:25
While working on a project recently, I used this:
HEALTHCHECK CMD stat celerybeat.pid || exit 1
Essentially, the beat process writes a pid file in some location (usually the home location); all you have to do is stat the file to check that it is there.
Note: this worked while launching a standalone celery beat process in a Docker container.

The goal of liveness for celery beat/scheduler is to check whether celery beat/scheduler is able to send the job to the message broker so that it can be picked up by the respective consumer (i.e., is it still working or in a hung state?). The celery worker and the celery scheduler/beat may or may not be running in the same pod or instance.
To handle such scenarios, we can create a method update_scheduler_liveness decorated with @after_task_publish.connect, which will be called every time the scheduler successfully publishes a message/task to the message broker.
The method update_scheduler_liveness will update the current timestamp to a file every time a task is published successfully.
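A minimal sketch of that handler, assuming the file-based approach (the file name mirrors the stat command below; the path and handler details are assumptions):

import time
from celery.signals import after_task_publish

LIVENESS_FILE = "celery_beat_schedule_liveness.stat"  # path is an assumption

@after_task_publish.connect
def update_scheduler_liveness(sender=None, headers=None, body=None, **kwargs):
    # Runs every time a task is successfully published to the broker;
    # refresh the timestamp so a liveness probe can check freshness.
    with open(LIVENESS_FILE, "w") as f:
        f.write(str(time.time()))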
In the liveness probe, we need to check the last updated timestamp of the file, either using the command stat --printf="%Y" celery_beat_schedule_liveness.stat, or by explicitly reading the file, extracting the timestamp, and checking whether the timestamp is recent enough based on the liveness probe criteria.
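A sketch of the probe-side check in Python, equivalent to the stat command above (the threshold is an assumption and should match how often your beat publishes):

import os
import sys
import time

MAX_AGE_SECONDS = 120  # tune to the interval of your most frequent beat job

def beat_is_alive(path="celery_beat_schedule_liveness.stat"):
    try:
        last_update = os.stat(path).st_mtime  # the value `stat --printf="%Y"` prints
    except FileNotFoundError:
        return False
    return (time.time() - last_update) < MAX_AGE_SECONDS

if __name__ == "__main__":
    sys.exit(0 if beat_is_alive() else 1)  # non-zero exit fails the liveness probe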
In this approach, the tighter the liveness criteria you need, the more frequently a job must be triggered from celery beat. So, for cases where the interval between jobs is quite large, a custom/dedicated liveness heartbeat job can be scheduled every 2-5 minutes and the consumer can just process it. The @after_task_publish.connect decorator provides multiple arguments that can also be used to filter for the liveness-specific jobs that were triggered.
If we don't want to go with the file-based approach, we can rely on a Redis-like data source with an instance-specific Redis key, implemented along the same lines.
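A sketch of that Redis variant along the same lines (key name, instance id, threshold and connection settings are assumptions):

import time
import redis

r = redis.Redis()  # connection settings are up to your deployment

def record_heartbeat(instance_id="scheduler-1"):
    # call this from the after_task_publish handler instead of writing a file
    r.set(f"celery_beat_liveness:{instance_id}", time.time())

def beat_is_alive(instance_id="scheduler-1", max_age=120):
    value = r.get(f"celery_beat_liveness:{instance_id}")
    return value is not None and time.time() - float(value) < max_age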
