I'm using django-celery-beat in a Django app (it stores the schedule in the database instead of a local file). My schedule is defined in a celery_beat dictionary on the configuration object that Celery loads via app.config_from_object(...).

I recently renamed/removed a few tasks and restarted the app. The new tasks showed up, but the tasks removed from the celery_beat dictionary didn't get removed from the database.

Is this the expected workflow -- requiring manual removal of tasks from the database? Is there a way to automatically reconcile the schedule at Django startup?

I tried deleting the stale rows with a filtered PeriodicTask.objects...delete() in celery/__init__.py:

def _clean_schedule():
    from django.db import transaction
    from django_celery_beat.models import PeriodicTask
    from django_celery_beat.models import PeriodicTasks
    with transaction.atomic():
        PeriodicTask.objects.\
            exclude(task__startswith='celery.').\
            exclude(name__in=settings.CELERY_CONFIG.celery_beat.keys()).\
            delete()
        PeriodicTasks.update_changed()


_clean_schedule()

but that is not allowed because Django isn't properly started up yet:

django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.

You also can't use Django's AppConfig.ready() because making queries / db connections in ready() is not supported.

rrauenza

1 Answer

Looking at how django-celery-beat actually installs the schedules, I thought maybe I could hook into that process.

It doesn't happen when Django starts -- it happens when beat starts. It calls setup_schedule() against the class passed on the beat command line.

Therefore, we can just override the scheduler with

--scheduler=myproject.lib.scheduler:DatabaseSchedulerWithCleanup

to do cleanup:

import logging

from django.db import transaction
from django_celery_beat.models import PeriodicTask, PeriodicTasks
from django_celery_beat.schedulers import DatabaseScheduler

logger = logging.getLogger(__name__)


class DatabaseSchedulerWithCleanup(DatabaseScheduler):
    """DatabaseScheduler that drops rows no longer listed in beat_schedule."""

    def setup_schedule(self):
        schedule = self.app.conf.beat_schedule
        with transaction.atomic():
            # Keep Celery's built-in tasks (task path starting with
            # 'celery.') and everything still named in beat_schedule;
            # delete every other row.
            num, _ = (
                PeriodicTask.objects
                .exclude(task__startswith='celery.')
                .exclude(name__in=schedule.keys())
                .delete()
            )
            logger.info("Removed %d obsolete periodic tasks.", num)
            if num > 0:
                # Record that the stored schedule changed.
                PeriodicTasks.update_changed()
        # Let the stock scheduler install the current beat_schedule entries.
        super().setup_schedule()
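
As an alternative to the --scheduler flag, Celery can also read the scheduler class from the beat_scheduler setting. The following is only a sketch: the CeleryConfig class name and the send-report entry are hypothetical, and it assumes the object is loaded with app.config_from_object(...) without a namespace (with namespace='CELERY' the names would be upper-case and prefixed).

```python
class CeleryConfig:
    """Hypothetical config object passed to app.config_from_object(...)."""

    # Same "module:ClassName" path that would follow --scheduler=.
    beat_scheduler = "myproject.lib.scheduler:DatabaseSchedulerWithCleanup"

    # With the cleanup scheduler, only the names listed here (plus tasks
    # whose path starts with 'celery.') are kept in the database.
    beat_schedule = {
        "send-report": {  # hypothetical entry name
            "task": "myproject.tasks.send_report",  # hypothetical task path
            "schedule": 3600.0,  # run every 3600 seconds
        },
    }
```

Setting it in configuration means nobody has to remember the flag when starting beat by hand.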

Note: you only want this if you are managing tasks exclusively with beat_schedule. Tasks added via the Django admin or programmatically will also be deleted.
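
To make that caveat concrete, here is a small pure-Python sketch of which rows the two exclude() filters would select for deletion. It does not touch Django; Row is a hypothetical stand-in for a PeriodicTask row, and all names and task paths are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Stand-in for a PeriodicTask row: only the two fields the filter reads."""
    name: str
    task: str


def obsolete_rows(rows, schedule):
    """Return the rows the cleanup would delete.

    Mirrors the chained exclude() calls: a row survives if its task path
    starts with 'celery.' or if its name is a key of the schedule.
    """
    return [
        row for row in rows
        if not row.task.startswith("celery.") and row.name not in schedule
    ]


# Hypothetical data for illustration only.
schedule = {
    "send-report": {"task": "myproject.tasks.send_report", "schedule": 3600},
}
rows = [
    Row("send-report", "myproject.tasks.send_report"),        # still configured
    Row("old-report", "myproject.tasks.old_report"),          # removed from config
    Row("celery.backend_cleanup", "celery.backend_cleanup"),  # built-in, kept
    Row("added-in-admin", "myproject.tasks.adhoc"),           # created via admin
]

doomed = [row.name for row in obsolete_rows(rows, schedule)]
print(doomed)  # ['old-report', 'added-in-admin']
```

The admin-created row is deleted along with the stale one, because the filter cannot tell them apart: neither name appears in the schedule.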

rrauenza