585

I've been working on a web app using Django, and I'm curious if there is a way to schedule a job to run periodically.

Basically I just want to run through the database and make some calculations/updates on an automatic, regular basis, but I can't seem to find any documentation on doing this.

Does anyone know how to set this up?

To clarify: I know I can set up a cron job to do this, but I'm curious if there is some feature in Django that provides this functionality. I'd like people to be able to deploy this app themselves without having to do much config (preferably zero).

I've considered triggering these actions "retroactively" by simply checking if a job should have been run since the last time a request was sent to the site, but I'm hoping for something a bit cleaner.
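The "retroactive" check described above could be sketched roughly as follows. This is only an illustration in plain Python: `is_due`, `last_run` and `interval` are hypothetical names, and a real app would need to persist the timestamp (for example in a model) and call the check from request handling.

```python
from datetime import datetime, timedelta


def is_due(last_run, interval, now):
    """Return True if at least `interval` has passed since `last_run`.

    `last_run` is None when the job has never run.
    """
    if last_run is None:
        return True
    return now - last_run >= interval


# Example: an hourly job that last ran 90 minutes ago is overdue.
now = datetime(2024, 1, 1, 12, 0)
print(is_due(now - timedelta(minutes=90), timedelta(hours=1), now))
```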

Suncatcher
TM.

26 Answers

404

One solution that I have employed is to do this:

1) Create a custom management command, e.g.

python manage.py my_cool_command

2) Use cron (on Linux) or at (on Windows) to run my command at the required times.

This is a simple solution that doesn't require installing a heavy AMQP stack. However, there are nice advantages to using something like Celery, mentioned in the other answers. In particular, with Celery it is nice not to have to spread your application logic out into crontab files. Still, the cron solution works quite nicely for a small to medium-sized application where you don't want a lot of external dependencies.

EDIT:

On later versions of Windows (Windows 8, Server 2012 and above), the at command is deprecated. You can use schtasks.exe for the same purpose.

**** UPDATE **** This is the new link to the Django documentation on writing custom management commands.

Aniruddh Agarwal
Brian Neal
  • 7
    Is this a way to do this without external services but using an only running django framework process? – sergzach Oct 14 '11 at 13:57
  • 4
    @Brian_Neal django_cron application. – sergzach Dec 04 '11 at 22:13
  • 2
    Please help me understand how will I run a management command in a virtual environment using cron on the last day of every month. – mmrs151 Mar 29 '12 at 23:17
  • 2
    @sergzach I followed up on this comment and it turns out there are two packages with this name. The [django-cron on Google Code](http://code.google.com/p/django-cron/) and the [django-cron on Github](https://github.com/Tivix/django-cron). They are slightly different but both interesting. Both allow you to define crons in a 'Djangonic' way. The first one is a bit older and aims to work without an external task (i.e. the cron). The second one on the other hand requires you to set a cron to run `python manage.py runcrons` which then runs all crons you have defined and registered. – floer32 Oct 18 '12 at 14:18
  • @seafangs It's difficult to use django-cron with multiprocess application server (synchronization problem). For example in case when you have several instances of uwsgi. For such cases you should start schedule tasks as separated process. This process can have several well-synchronized threads. Please correct me if I am not right. – sergzach Oct 18 '12 at 14:47
  • 1
    @sergzach I am assuming you are referring to the first one, "django-cron on Google Code". You are right about that one. This is actually why I opt for the second one, "django-cron on GitHub", because it makes it so you have a simple crontab setup/management - only one crontab, referring to the management command - but since you are using a separate cron process you avoid this synchronization issue (as far as I can tell). – floer32 Oct 18 '12 at 15:26
  • Celery will work, but it is quite a lot of work to install and keep working. I set it up for a fairly infrequent background task (gets run between twice a day and once a week). It was a bit overkill, and stopped working after the sysadmin made some server changes, and was a nightmare to get working again, despite taking notes on the initial install. – wobbily_col Aug 13 '14 at 09:12
  • Celery is a much better alternative, and is a breeze to set up (the above three-year-old comment is out of date, regarding complications). You'll also likely need celery for other async tasks. – WhyNotHugo Feb 06 '17 at 00:58
  • FYI for anyone that runs into this later, I created a shell script, activated my virtual env, cd'ed into the Django directory, then ran python manage.py my_cool_command. – Scott Skiles Jun 29 '18 at 17:23
172

Celery is a distributed task queue, built on AMQP (RabbitMQ). It also handles periodic tasks in a cron-like fashion (see periodic tasks). Depending on your app, it might be worth a gander.

Celery is pretty easy to set up with django (docs), and periodic tasks will actually skip missed tasks in case of a downtime. Celery also has built-in retry mechanisms, in case a task fails.

WhyNotHugo
dln
55

We've open-sourced what I think is a structured app that Brian's solution above alludes to. We would love any and all feedback!

https://github.com/tivix/django-cron

It comes with one management command:

./manage.py runcrons

That does the job. Each cron is modeled as a class (so it's all OO), and each cron runs at a different frequency. We make sure the same cron type doesn't run in parallel (in case a cron takes longer to run than its frequency!).

leonheess
chachra
  • 7
    @chachra Sorry, I know this might be a dumb question, but will this work on Windows through `at`, or was it designed specifically to work with `cron`? – Bruno Finger Oct 05 '15 at 20:27
  • 1
    @BrunoFinger It uses python classes, so it's basically just python, no platform specific command needed. – Hagai Wild Aug 16 '21 at 08:08
37

If you're using a standard POSIX OS, you use cron.

If you're using Windows, you use at.

Write a Django management command to

  1. Figure out what platform they're on.

  2. Either execute the appropriate "AT" command for your users, or update the crontab for your users.
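The two steps above could be sketched like this. It is a rough illustration, not a complete installer: the function only builds the scheduling command and leaves running it (or editing the crontab) to the caller. It uses `schtasks` rather than `at` on Windows, since `at` is deprecated on newer versions, and `scheduler_command`, the job name and the example command are all hypothetical.

```python
import platform


def scheduler_command(job_name, command, hour=2, minute=0, system=None):
    """Build (not run) the OS-specific way to schedule `command` daily.

    Returns a crontab line on POSIX systems and a schtasks argument
    list on Windows.
    """
    system = system or platform.system()
    if system == "Windows":
        return [
            "schtasks", "/Create",
            "/SC", "DAILY",
            "/TN", job_name,
            "/TR", command,
            "/ST", f"{hour:02d}:{minute:02d}",
        ]
    # cron fields: minute hour day-of-month month day-of-week
    return f"{minute} {hour} * * * {command}"


print(scheduler_command("nightly", "python manage.py my_cool_command", system="Linux"))
```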

xuhdev
S.Lott
23

Interesting new pluggable Django app: django-chronograph

You only have to add one cron entry which acts as a timer, and you have a very nice Django admin interface into the scripts to run.

Austin Adams
Van Gale
  • 2
    django-chronograph is unmaintained. Its fork is doing much better: https://github.com/chrisspen/django-chroniker – Menda Apr 12 '19 at 13:30
16

Look at Django Poor Man's Cron, a Django app that makes use of spambots, search engine indexing robots and the like to run scheduled tasks at approximately regular intervals.

See: http://code.google.com/p/django-poormanscron/

user41767
  • 2
    This also assumes that your Django app is accessible from the web, which would not be the case for deployments on LANs and VPNs. – TimH - Codidact Mar 14 '17 at 19:02
15

I had exactly the same requirement a while ago, and ended up solving it using APScheduler (User Guide)

It makes scheduling jobs super simple and keeps it independent from request-based execution of some code. The following is a simple example.

from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
job = None

def tick():
    print('One tick!')

def start_job():
    global job
    # Run tick() once every hour.
    job = scheduler.add_job(tick, 'interval', seconds=3600)
    try:
        scheduler.start()
    except Exception:
        # Ignore start-up errors, e.g. if the scheduler is already running.
        pass

Hope this helps somebody!

PhoenixDev
  • 2
    How do you include this in the Django app? Are you creating the scheduler in the `wsgi.py`? Or is this running as a completely separate process? – tbrlpld Aug 13 '20 at 01:15
  • hey @PhoenixDev, how do i utilize this in a django project structure? where do i put this scheduler? would appreciate any suggestion. – P S Solanki Aug 09 '21 at 09:55
11

Django APScheduler for Scheduler Jobs. Advanced Python Scheduler (APScheduler) is a Python library that lets you schedule your Python code to be executed later, either just once or periodically. You can add new jobs or remove old ones on the fly as you please.

note: I'm the author of this library

Install APScheduler

pip install apscheduler

View file function to call

file name: scheduler_jobs.py

def FirstCronTest():
    print("")
    print("I am executed..!")

Configuring the scheduler

Create an execute.py file and add the code below:

from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()

Import your functions. Here, the scheduler functions are written in scheduler_jobs:

import scheduler_jobs 

scheduler.add_job(scheduler_jobs.FirstCronTest, 'interval', seconds=10)
scheduler.start()

Link the File for Execution

Now add the line below at the bottom of the URL file:

import execute
Jean-François Fabre
Chandan Sharma
10

Brian Neal's suggestion of running management commands via cron works well, but if you're looking for something a little more robust (yet not as elaborate as Celery) I'd look into a library like Kronos:

# app/cron.py

import kronos

@kronos.register('0 * * * *')
def task():
    pass
Johannes Gorset
10

RabbitMQ and Celery have more features and task handling capabilities than Cron. If task failure isn't an issue, and you think you will handle broken tasks in the next call, then Cron is sufficient.

Celery & AMQP will let you handle the broken task, and it will get executed again by another worker (Celery workers listen for the next task to work on), until the task's max_retries attribute is reached. You can even invoke tasks on failure, like logging the failure, or sending an email to the admin once the max_retries has been reached.

And you can distribute Celery and AMQP servers when you need to scale your application.
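To make the retry idea concrete, here is a plain-Python sketch of "retry until a maximum is reached, then notify". It is not Celery's API: `run_with_retries` and `on_failure` are hypothetical names, and Celery additionally hands retries to other workers through the broker, which this loop does not attempt.

```python
def run_with_retries(task, max_retries=3, on_failure=None):
    """Call `task` until it succeeds or `max_retries` retries are used up."""
    attempts = 0
    while True:
        try:
            return task()
        except Exception as exc:
            if attempts >= max_retries:
                # Retries exhausted: notify (e.g. log or email) and give up.
                if on_failure is not None:
                    on_failure(exc)
                raise
            attempts += 1


calls = []


def flaky():
    """Fail on the first two calls, then succeed."""
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("temporary failure")
    return "done"


print(run_with_retries(flaky))
```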

sleblanc
Ravi Kumar
8

Although not part of Django, Airflow is a more recent project (as of 2016) that is useful for task management.

Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines. A web-based UI provides the developer with a range of options for managing and viewing these pipelines.

Airflow is written in Python and is built using Flask.

Airflow was created by Maxime Beauchemin at Airbnb and open sourced in the spring of 2015. It joined the Apache Software Foundation’s incubation program in the winter of 2016. Here is the Git project page and some additional background information.

Alexander
8

I personally use cron, but the Jobs Scheduling part of django-extensions looks interesting.

chhantyal
Van Gale
  • Still depends on cron for triggering, just adds another abstraction layer in between. Not sure it's worth it, personally. – Carl Meyer Feb 23 '09 at 02:05
  • I agree, and after thinking about it I don't want request middleware slowing down my site (ala poormanscron above) when cron can do the job better anyway. – Van Gale Feb 23 '09 at 05:31
  • Is there any sample for django_extensions ? The docs is not enough for a complete guide. – Semih Sep 13 '22 at 05:12
6

Put the following at the top of your cron.py file:

#!/usr/bin/python
import os, sys
sys.path.append('/path/to/') # the parent directory of the project
sys.path.append('/path/to/project') # these lines only needed if not on path
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproj.settings'

import django
django.setup() # needed on Django 1.7+ before importing models

# imports and code below
Matt McCormick
6

I just thought about this rather simple solution:

  1. Define a view function do_work(req, param) like you would with any other view, with URL mapping, return a HttpResponse and so on.
  2. Set up a cron job with your timing preferences (or using AT or Scheduled Tasks in Windows) which runs curl http://localhost/your/mapped/url?param=value.

You can pass parameters by just adding them to the URL.

Tell me what you guys think.

[Update] I'm now using runjob command from django-extensions instead of curl.

My cron looks something like this:

@hourly python /path/to/project/manage.py runjobs hourly

... and so on for daily, monthly, etc. You can also set it up to run a specific job.

I find it more manageable and cleaner. It doesn't require mapping a URL to a view. Just define your job class and crontab and you're set.

Michael
  • 1
    only problem am sensing is un-necessarily adding load to the app and bandwidth just to run a background job that would better be launched "internally" and independent of the serving app. But other than that, this is a clever n more generic django-cron because it can even be invoked by agents external to the app's server! – JWL Jan 25 '12 at 17:51
  • You are right, that's why I moved to using jobs from django-command-extensions. See my update to my answer. – Michael Jan 25 '12 at 21:16
4

You should definitely check out django-q! It requires no additional configuration and has quite possibly everything needed to handle any production issues on commercial projects.

It's actively developed and integrates very well with django, django ORM, mongo, redis. Here is my configuration:

# django-q
# -------------------------------------------------------------------------
# See: http://django-q.readthedocs.io/en/latest/configure.html
Q_CLUSTER = {
    # Match recommended settings from docs.
    'name': 'DjangoORM',
    'workers': 4,
    'queue_limit': 50,
    'bulk': 10,
    'orm': 'default',

    # Custom Settings
    # ---------------
    # Limit the amount of successful tasks saved to Django.
    'save_limit': 10000,

    # See https://github.com/Koed00/django-q/issues/110.
    'catch_up': False,

    # Number of seconds a worker can spend on a task before it's terminated.
    'timeout': 60 * 5,

    # Number of seconds a broker will wait for a cluster to finish a task before presenting it again. This needs to be
    # longer than `timeout`, otherwise the same task will be processed multiple times.
    'retry': 60 * 6,

    # Whether to force all async() calls to be run with sync=True (making them synchronous).
    'sync': False,

    # Redirect worker exceptions directly to Sentry error reporter.
    'error_reporter': {
        'sentry': RAVEN_CONFIG,
    },
}
saran3h
4

After this block of code, I can write anything, just as I would in my views.py :)

#######################################
import os,sys
sys.path.append('/home/administrator/development/store')
os.environ['DJANGO_SETTINGS_MODULE']='store.settings'
from django.core.management import setup_environ  # removed in Django 1.6; newer versions use django.setup()
from store import settings
setup_environ(settings)
#######################################

from http://www.cotellese.net/2007/09/27/running-external-scripts-against-django-models/

xiaohei
3

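The methods above are great, and I tried some of them. In the end, I settled on a method like this:

    from threading import Timer

    INTERVAL = 60  # seconds between runs (placeholder value)

    def sync():
        try:
            pass  # do something...
        finally:
            # Re-arm the timer even if the work above raised an exception.
            sync_timer = Timer(INTERVAL, sync)
            sync_timer.start()

It works like recursion: each run schedules the next one.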

Ok, I hope this method can meet your requirement. :)

Ni Xiaoni
  • 1
    Will stop if your 'something' ever fails, so make sure you handle all exceptions within it. Even then, the web server might kill your thread at some point, might it not? – Lutz Prechelt Nov 14 '14 at 16:53
3

A more modern solution (compared to Celery) is Django Q: https://django-q.readthedocs.io/en/latest/index.html

It has great documentation and is easy to grok. Windows support is lacking, because Windows does not support process forking. But it works fine if you create your dev environment using the Windows Subsystem for Linux.

devdrc
  • It seems you can [still use it](https://django-q.readthedocs.io/en/latest/install.html#windows) in a single-cluster mode on Windows – Yushin Washio Sep 04 '18 at 15:02
2

I had a problem similar to yours today.

I didn't want to have it handled by the server through cron (and most of the libs were just cron helpers in the end).

So I've created a scheduling module and attached it to the init.

It's not the best approach, but it helps me to have all the code in a single place and with its execution related to the main app.

Fabricio Buzeto
1

I use celery to create my periodical tasks. First you need to install it as follows:

pip install django-celery

Don't forget to register django-celery in your settings and then you could do something like this:

from celery import task
from celery.decorators import periodic_task
from celery.task.schedules import crontab
from celery.utils.log import get_task_logger
@periodic_task(run_every=crontab(minute="0", hour="23"))  # fires daily at 23:00; use hour="0" for midnight
def do_every_midnight():
    pass  # your code
Peter Brittain
  • 2
    I notice that this advice is out of date and you can integrate celery directly. See https://pypi.python.org/pypi/django-celery for details. – Peter Brittain Aug 31 '15 at 22:25
  • [Celery docs](http://docs.celeryproject.org/en/latest/django/first-steps-with-django.html) say that this was a change in v3.1. I've not tried it myself yet. – Peter Brittain Sep 01 '15 at 16:02
1

I am not sure whether this will be useful for anyone. Since I had to let other users of the system schedule jobs without giving them access to the actual server's (Windows) Task Scheduler, I created this reusable app.

Please note users have access to one shared folder on server where they can create required command/task/.bat file. This task then can be scheduled using this app.

App name is Django_Windows_Scheduler


just10minutes
0

If you want something more reliable than Celery, try TaskHawk which is built on top of AWS SQS/SNS.

Refer: http://taskhawk.readthedocs.io

Sri
0

For simple dockerized projects, I could not really see any existing answer fit.

So I wrote a very barebones solution without the need of external libraries or triggers, which runs on its own. No external os-cron needed, should work in every environment.

It works by adding a middleware: middleware.py

import threading

def should_run(name, seconds_interval):
    from application.models import CronJob
    from django.utils.timezone import now

    try:
        c = CronJob.objects.get(name=name)
    except CronJob.DoesNotExist:
        CronJob(name=name, last_ran=now()).save()
        return True

    if (now() - c.last_ran).total_seconds() >= seconds_interval:
        c.last_ran = now()
        c.save()
        return True

    return False


class CronTask:
    def __init__(self, name, seconds_interval, function):
        self.name = name
        self.seconds_interval = seconds_interval
        self.function = function


def cron_worker(*_):
    if not should_run("main", 60):
        return

    # customize this part:
    from application.models import Event
    tasks = [
        CronTask("events", 60 * 30, Event.clean_stale_objects),
        # ...
    ]

    for task in tasks:
        if should_run(task.name, task.seconds_interval):
            task.function()


def cron_middleware(get_response):

    def middleware(request):
        response = get_response(request)
        threading.Thread(target=cron_worker).start()
        return response

    return middleware

models/cron.py:

from django.db import models


class CronJob(models.Model):
    name = models.CharField(max_length=10, primary_key=True)
    last_ran = models.DateTimeField()

settings.py:

MIDDLEWARE = [
    ...
    'application.middleware.cron_middleware',
    ...
]
yspreen
0

A simple way is to write a custom shell command (see the Django documentation) and run it with a cron job on Linux. However, I would highly recommend using a message broker like RabbitMQ coupled with Celery. You may also want to have a look at this tutorial.

Hamfri
0

One alternative is to use Rocketry:

from rocketry import Rocketry
from rocketry.conds import daily, after_success

app = Rocketry()

@app.task(daily.at("10:00"))
def do_daily():
    ...

@app.task(after_success(do_daily))
def do_after_another():
    ...

if __name__ == "__main__":
    app.run()

It also supports custom conditions:

from pathlib import Path

@app.cond()
def file_exists(file):
    return Path(file).exists()

@app.task(daily & file_exists("myfile.csv"))
def do_custom():
    ...

And it also supports Cron:

from rocketry.conds import cron

@app.task(cron('*/2 12-18 * Oct Fri'))
def do_cron():
    ...

It can be integrated quite nicely with FastAPI, and I think it could be integrated with Django as well, since Rocketry is essentially just a sophisticated loop that can spawn async tasks, threads and processes.

Disclaimer: I'm the author.

miksus
0

Another option, similar to Brian Neal's answer, is to use RunScripts.

Then you don't need to set up commands. This has the advantage of more flexible or cleaner folder structures.

This file must implement a run() function. This is what gets called when you run the script. You can import any models or other parts of your django project to use in these scripts.

And then, just

python manage.py runscript path.to.script
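A minimal sketch of such a script file might look like this. The path `scripts/update_totals.py`, the `dry-run` argument and the commented-out model import are all hypothetical; the only requirement described above is the `run()` function.

```python
# scripts/update_totals.py  (hypothetical path inside the project)


def run(*args):
    """Entry point called by `manage.py runscript`."""
    # Import models here and do the real work, e.g.:
    # from myapp.models import Order
    dry_run = "dry-run" in args
    print("dry run" if dry_run else "updating records")
    return dry_run
```

With django-extensions, extra arguments can be passed through, e.g. `python manage.py runscript update_totals --script-args dry-run`.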
Roman