
I have a recurring cron job that runs a Django management command. The command interacts with the ORM, sends email with sendmail, and sends SMS with Twilio. It's possible that the cron jobs will begin to overlap. In other words, the job (that runs this command) might still be executing when the next job starts to run. Will this cause any issues? (I don't want to wait for the management command to finish executing before running the management command again with cron).

EDIT:

The very beginning of the management command gets a timestamp of when the command was run. At a minimum, this timestamp needs to be accurate. It would be nice if the rest of the command didn't wait for the previous cron job to finish running, but that's non-critical.
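A minimal sketch of the ordering I have in mind, assuming the timestamp is taken before anything that might block. `run_command`, `lock`, and `work` are hypothetical names used only for illustration, and the standard library's `datetime.timezone.utc` stands in for whatever timezone helper the real command uses:

```python
import datetime
import threading


def run_command(lock, work):
    """Record the invocation time, then wait for the lock before working."""
    # Taken first, so it reflects when the command was called,
    # not when the lock was finally acquired.
    invoked_at = datetime.datetime.now(datetime.timezone.utc)
    with lock:
        # Everything that must not overlap goes here.
        return work(invoked_at)


if __name__ == "__main__":
    print(run_command(threading.Lock(), lambda ts: ts.isoformat()))
```

Here a `threading.Lock` only stands in for whatever cross-process lock ends up being used; the point is just that `invoked_at` is fixed before any waiting happens.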

EDIT 2:

The cron job only reads from the DB; it doesn't write to it. The application has to continue to work while the cron job is running. The application both reads from and writes to the DB.

Daniel

3 Answers


My understanding of cron is that it forks off each job as a background process, so multiple jobs can run at the same time. This can be a problem if the second job depends on the first having finished (for example, if the second produces a daily report from data aggregated by the first). If you don't want them to run concurrently, there are workarounds:

How to prevent the cron job execution, if it is already running.

Will Cron start a new job if the current job is not complete?

vandsh
  • Please see my edit. Wouldn't at least part of it have to run concurrently? – Daniel Nov 23 '15 at 19:06
  • If you're asking whether they can run concurrently at some point: yes, it's possible, depending on how long the command takes to execute and how often cron runs it. If you're asking whether it's necessary that they run concurrently, I would say no; if they do need to run concurrently for some reason, I would say they should be refactored into one script to make sure both blocks of logic are executed in order. Hope this helps; I don't have much insight into what you actually have taking place. – vandsh Nov 23 '15 at 19:15

Yes. This could definitely cause issues. You have a race condition. If you wish, you could acquire a lock somehow on a critical section which would prevent the next invocation from entering a section of code until the first invocation of the command finished. You may be able to do a row lock or a table lock for the underlying data.

Let's presume you're using MySQL, which has its own lock syntax (locking is DB dependent), and that you have this model:

from django.db import models

class Email(models.Model):
    sent = models.BooleanField(default=False)
    subj = models.CharField(max_length=140)
    msg = models.TextField()

You can create a lock object like this:

from django.db import connection
[...]
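class EmailLocks(object):
    """Context manager that holds a MySQL write lock on the email table."""
    def __init__(self):
        self.c = connection.cursor()
    def __enter__(self):
        # Blocks until the table lock is acquired (MySQL-specific syntax).
        self.c.execute('lock tables my_app_email write')
        return self
    def __exit__(self, *err):
        # Runs on normal exit and when the body raised an exception.
        self.c.execute('unlock tables')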

Then lock all of your critical sections like:

with EmailLocks():
    # read the email table and decide if you need to process it
    for e in Email.objects.filter(sent=False):
        # send the email
        # mark the email as sent
        e.sent = True
        e.save()

The lock object will automatically unlock the table on exit. Also, if your code raises an exception inside the with block, the table will still be unlocked.
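On PostgreSQL, a similar effect may be achievable without locking the table itself, by using an advisory lock. What follows is a sketch under assumptions, not a tested recipe for your command: `advisory_lock` and `LOCK_KEY` are hypothetical names, the key value is arbitrary, and the cursor is assumed to be a DB-API cursor such as the one returned by Django's `connection.cursor()`. The only PostgreSQL functions used are `pg_try_advisory_lock` and `pg_advisory_unlock`.

```python
from contextlib import contextmanager

# Hypothetical, application-chosen key; any integer both runs agree on.
LOCK_KEY = 815001


@contextmanager
def advisory_lock(cursor, key=LOCK_KEY):
    """Yield True if the advisory lock was acquired, False otherwise."""
    # Non-blocking attempt: returns immediately instead of waiting.
    cursor.execute("SELECT pg_try_advisory_lock(%s)", [key])
    acquired = bool(cursor.fetchone()[0])
    try:
        yield acquired
    finally:
        # Only release a lock this call actually obtained.
        if acquired:
            cursor.execute("SELECT pg_advisory_unlock(%s)", [key])
```

Inside the management command this could be used as `with advisory_lock(connection.cursor()) as got_it:`, skipping the critical section when `got_it` is false. Because an advisory lock is not tied to any table, the application's own reads and writes should not be blocked by it.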

Ross Rogers
  • The question then would be which parts of the code need to be locked? Any ideas for how I can find an answer? – Daniel Nov 23 '15 at 19:07
  • You would need to lock the [critical section](https://en.wikipedia.org/wiki/Critical_section). I can't know what that will be unless you post all your management code. One could guess that you have a table with messages to send, you read that table, send the messages, and then mark the messages as sent. You have to lock the table before reading it and then unlock it after marking the messages as sent. – Ross Rogers Nov 23 '15 at 19:28
  • If you can't lock writes or reads, and only your management process writes the "sent" bit (presuming again), then you could create a dummy table explicitly for use in locking. You'd lock this proxy/dummy table before entering the critical section, and then unlock it after updating the real table. – Ross Rogers Nov 23 '15 at 19:30
  • What database are you using? I can show you how to lock. – Ross Rogers Nov 23 '15 at 19:36
  • Postgresql, latest stable version. Thanks. I think I have to use the dummy table idea. Just to check, would running nowdate = datetime.datetime.now(pytz.timezone('UTC')) at the start of the management command count as part of the critical section? I'm guessing not. – Daniel Nov 23 '15 at 19:41
  • If you're trying to write your own lock, don't. I don't know how `nowdate` is being used, so I can't comment on whether it belongs in the critical section. Just know that a lock can take a while to acquire. If you need a newer version of `nowdate`, then don't grab the value until you're in the critical section. – Ross Rogers Nov 23 '15 at 19:47
  • I think the lines where nowdate gets used are in the critical section. The management command's logic will only work properly though if nowdate stores when the command was called, not when the critical section begins executing. I just need to know whether nowdate = datetime.datetime.now(pytz.timezone('UTC')) *per se* would count as part of the critical section. I'm not asking whether other lines where nowdate gets used would count as part of the critical section. – Daniel Nov 23 '15 at 20:28
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/95964/discussion-between-daniel-and-ross-rogers). – Daniel Nov 23 '15 at 20:40

So you have a cron job that runs a Django management command and you don't want the runs to overlap.

You can use flock, which takes an exclusive lock on a lockfile and holds it while the command runs; the lock is released when the command exits. If the second job starts before the first one has ended, flock sees that the lockfile is already locked and, because of the -n option, fails immediately instead of running the second one.

Below is the crontab entry I used:

* * * * * /usr/bin/flock -n /tmp/fcj.lockfile /usr/bin/python /home/txuser/dev/Project1/projectnew/manage.py flocktest

There is a lot more you can do with this.
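If you'd rather have the command guard itself instead of relying on the crontab entry, roughly the same non-blocking behaviour can probably be had from Python's `fcntl.flock`. This is a minimal sketch, assuming a Unix-like system; `try_lock` and the lock path are hypothetical names, not part of Django or of the flock utility:

```python
import fcntl
import sys


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes this fail at once instead of waiting, like `flock -n`.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    # Keep the returned file open for as long as the lock should be held.
    return handle


if __name__ == "__main__":
    lock = try_lock("/tmp/fcj.lockfile")
    if lock is None:
        sys.exit("previous run still in progress")
    # ... do the work of the command here ...
    lock.close()  # closing the file releases the lock
```

The lock is tied to the open file, so it also goes away if the process dies, which avoids stale lockfiles blocking later runs.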

ns15