
I want to run this task every three minutes, and this is what I have:

tasks.py

import asyncio
import logging

from celery import shared_task

logger = logging.getLogger(__name__)

@shared_task
def save_hackathon_to_db():
    logger.info('ran')
    loop = asyncio.get_event_loop()
    statuses = ['ended', 'open', 'upcoming']
    loop.run_until_complete(send_data(statuses))  # send_data is defined elsewhere in the project
    logger.info('ended')

settings.py

from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "devpost_api_task": {
        "task": "backend.tasks.save_hackathon_to_db",
        "schedule": crontab(minute="*/3"),
    },
}

However, this doesn't run every 3 minutes; it only runs when I send a POST request to http://0.0.0.0:8000/hackathon/task followed by a GET request to http://0.0.0.0:8000/hackathon/<task_id>.

These are the view functions for those routes, respectively:

from celery.result import AsyncResult
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt

from backend.tasks import save_hackathon_to_db


@csrf_exempt
def run_task(request):
    if request.method == 'POST':
        task = save_hackathon_to_db.delay()
        return JsonResponse({"task_id": task.id}, status=202)


@csrf_exempt
def get_status(request, task_id):
    print(task_id)
    task_result = AsyncResult(task_id)
    result = {
        "task_id": task_id,
        "task_status": task_result.status,
        "task_result": task_result.result
    }
    return JsonResponse(result, status=200)

docker-compose.yml

version: "3.9"

services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    command: python3 manage.py runserver 0.0.0.0:8000
    depends_on:
      - db
      - redis
  celery:
    build: ./backend
    command: celery -A backend worker -l INFO --logfile=logs/celery.log
    volumes:
      - ./backend:/usr/src/app
    depends_on:
      - backend
      - redis
  celery-beat:
    build: ./backend
    command: celery -A backend beat -l info
    volumes:
      - ./backend/:/usr/src/app/
    depends_on:
      - redis
  dashboard:
    build: ./backend
    command: flower -A backend --port=5555 --broker=redis://redis:6379/0
    ports:
      - 5555:5555
    depends_on:
      - backend
      - redis
      - celery
  db:
    image: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
  redis:
    image: redis:6-alpine
volumes:
  postgres_data:

EDIT #1: I doubt it is Possible Solution 1, because I do have the producer running.

These are the initial logs I get from celery-beat

celery-beat_1  | celery beat v4.4.7 (cliffs) is starting.
celery-beat_1  | ERROR: Pidfile (celerybeat.pid) already exists.
celery-beat_1  | Seems we're already running? (pid: 1)

celery logs

celery_1       | /usr/local/lib/python3.9/site-packages/celery/platforms.py:800: RuntimeWarning: You're running the worker with superuser privileges: this is
celery_1       | absolutely not recommended!
celery_1       | 
celery_1       | Please specify a different user using the --uid option.
celery_1       | 
celery_1       | User information: uid=0 euid=0 gid=0 egid=0
celery_1       | 
celery_1       |   warnings.warn(RuntimeWarning(ROOT_DISCOURAGED.format(
celery_1       |  
celery_1       |  -------------- celery@948692de8c79 v4.4.7 (cliffs)
celery_1       | --- ***** ----- 
celery_1       | -- ******* ---- Linux-5.10.25-linuxkit-x86_64-with 2021-09-10 19:59:53
celery_1       | - *** --- * --- 
celery_1       | - ** ---------- [config]
celery_1       | - ** ---------- .> app:         backend:0x7f6af44a8130
celery_1       | - ** ---------- .> transport:   redis://redis:6379/0
celery_1       | - ** ---------- .> results:     redis://redis:6379/0
celery_1       | - *** --- * --- .> concurrency: 4 (prefork)
celery_1       | -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
celery_1       | --- ***** ----- 
celery_1       |  -------------- [queues]
celery_1       |                 .> celery           exchange=celery(direct) key=celery
celery_1       |                 
celery_1       | 
celery_1       | [tasks]
celery_1       |   . backend.tasks.save_hackathon_to_db
celery_1       | 
[2021-09-10 19:59:53,949: INFO/MainProcess] Connected to redis://redis:6379/0
[2021-09-10 19:59:53,960: INFO/MainProcess] mingle: searching for neighbors
[2021-09-10 19:59:54,988: INFO/MainProcess] mingle: all alone
[2021-09-10 19:59:55,019: WARNING/MainProcess] /usr/local/lib/python3.9/site-packages/celery/fixups/django.py:205: UserWarning: Using settings.DEBUG leads to a memory
            leak, never use this setting in production environments!
  warnings.warn('''Using settings.DEBUG leads to a memory
[2021-09-10 19:59:55,020: INFO/MainProcess] celery@948692de8c79 ready.
[2021-09-10 19:59:59,464: INFO/MainProcess] Events of group {task} enabled by remote.

dashboard logs

dashboard_1    | [I 210910 19:59:54 command:135] Visit me at http://localhost:5555
dashboard_1    | [I 210910 19:59:54 command:142] Broker: redis://redis:6379/0
dashboard_1    | [I 210910 19:59:54 command:143] Registered tasks: 
dashboard_1    |     ['backend.tasks.save_hackathon_to_db',
dashboard_1    |      'celery.accumulate',
dashboard_1    |      'celery.backend_cleanup',
dashboard_1    |      'celery.chain',
dashboard_1    |      'celery.chord',
dashboard_1    |      'celery.chord_unlock',
dashboard_1    |      'celery.chunks',
dashboard_1    |      'celery.group',
dashboard_1    |      'celery.map',
dashboard_1    |      'celery.starmap']
dashboard_1    | [I 210910 19:59:54 mixins:229] Connected to redis://redis:6379/0
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method reserved failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method scheduled failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method conf failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method active_queues failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method stats failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method registered failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method active failed
dashboard_1    | [W 210910 19:59:55 inspector:42] Inspect method revoked failed

1 Answer

What we know is that:

  • Your Django app is running because it can accept the GET / POST requests
    • e.g. python manage.py runserver
  • Your Celery worker (consumer) is running because it can accept the asynchronous .delay() requests from the Django app
    • e.g. celery --app=my_proj worker --loglevel=INFO

Possible Solution 1

Check if the Celery scheduler (producer) is also running.

  • e.g. celery --app=my_proj beat --loglevel=INFO

Unless you embed the scheduler inside the worker, take note that they are different entities. The worker, a.k.a. the consumer, is the one that receives and processes the requests, while the scheduler, a.k.a. the producer, is the one that sends/triggers them. Without a running scheduler, the scheduled tasks are never enqueued, so no processing happens.

More info on running the scheduler is in the Celery documentation on periodic tasks.
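
If you prefer not to run a separate beat container, one alternative (a sketch, assuming the same backend Celery app as in your question) is to embed the scheduler into the worker with the -B flag:

  • e.g. celery -A backend worker -B -l INFO --logfile=logs/celery.log

Note that -B is only appropriate when you run a single worker; with several workers, each one would enqueue the scheduled tasks on its own.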

Possible Solution 2

Check for error logs in the worker instance (the consumer). It is possible that the scheduler (the producer) is running and enqueues the task, but the worker fails to process it, for example because the task implementation isn't found. In that case, make sure to include the location of the task in the Celery app's configuration, e.g. celery_app.conf.update(imports=['backend.tasks']).
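
A minimal sketch of what that could look like in the project's Celery app module (the file name, path and settings module below are assumptions, since the question doesn't show this file):

# backend/celery.py -- hypothetical layout, adjust to your project
import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'backend.settings')

celery_app = Celery('backend')
celery_app.config_from_object('django.conf:settings', namespace='CELERY')

# Either let Celery find tasks.py modules in installed Django apps...
celery_app.autodiscover_tasks()

# ...or list the task modules explicitly:
celery_app.conf.update(imports=['backend.tasks'])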


To verify that the scheduler is running correctly, its logs should show the task being enqueued every 3 minutes:

$ celery --app=my_proj beat --loglevel=INFO
celery beat v5.1.2 (sun-harmonics) is starting.
__    -    ... __   -        _
LocalTime -> 2021-09-11 01:38:05
Configuration ->
    . broker -> amqp://guest:**@127.0.0.1:5672//
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> celery.beat.PersistentScheduler
    . db -> celerybeat-schedule
    . logfile -> [stderr]@%INFO
    . maxinterval -> 5.00 minutes (300s)
[2021-09-11 01:38:05,523: INFO/MainProcess] beat: Starting...
[2021-09-11 01:38:05,544: INFO/MainProcess] Scheduler: Sending due task backend.tasks.save_hackathon_to_db (devpost_api_task)
[2021-09-11 01:41:05,613: INFO/MainProcess] Scheduler: Sending due task backend.tasks.save_hackathon_to_db (devpost_api_task)
[2021-09-11 01:44:05,638: INFO/MainProcess] Scheduler: Sending due task backend.tasks.save_hackathon_to_db (devpost_api_task)
...
  • So I have a `docker-compose.yml` which sets everything up and I've included this in my answer – A.K.M. Adib Sep 10 '21 at 19:56
  • 1
    Based on the logs you pasted in the question, seems like the beat scheduler didn't run successfully? I added sample logs in my answer that you could check. Basically, you should check if the log `[2021-09-11 01:38:05,544: INFO/MainProcess] Scheduler: Sending due task backend.tasks.save_hackathon_to_db (devpost_api_task)` is visible in your beat scheduler. Otherwise, your scheduler instance didn't run successfully. – Niel Godfrey Pablo Ponciano Sep 11 '21 at 01:45
  • It works after I delete `celerybeat.pid` before I run `docker-compose up --build` – do you know why this might be the case? And if so, should I add a check in `entrypoint.sh` that deletes `celerybeat.pid` beforehand if it exists? – A.K.M. Adib Sep 11 '21 at 02:50
  • That shouldn't be necessary. Perhaps you just triggered an earlier scheduler and kept it running in the background. Under normal circumstances, it shouldn't happen. – Niel Godfrey Pablo Ponciano Sep 11 '21 at 03:01
  • Apparently this happens if you run Celery beat in a container? I looked into this https://stackoverflow.com/questions/53521959/dockercelery-error-pidfile-celerybeat-pid-already-exists which led me to this https://stackoverflow.com/questions/10876246/disable-pidfile-for-celerybeat/17674248#17674248. I have solved it by changing the command in `celery-beat` to `command: celery -A backend beat --pidfile= -l info`, which disables the pidfile so I don't have to delete it each time it is created – A.K.M. Adib Sep 11 '21 at 03:41
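
For reference, this is roughly how the celery-beat service from the question's docker-compose.yml looks with the pidfile disabled, as described in the last comment (same layout as in the question):

  celery-beat:
    build: ./backend
    command: celery -A backend beat --pidfile= -l info
    volumes:
      - ./backend/:/usr/src/app/
    depends_on:
      - redis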