
Related: Quartz Clustering - triggers duplicated when the server starts

I'm using Quartz Scheduler to manage scheduled jobs in a Java-based clustered environment. There are a handful of nodes in the cluster at any given time, and they all run Quartz, backed by a data store in a PostgreSQL database that all nodes connect to.

When an instance is initialized, it tries to create or update the jobs and triggers in the Quartz data store by executing this code:

private void createOrUpdateJob(JobKey jobKey, Class<? extends org.quartz.Job> clazz, Trigger trigger) throws SchedulerException {
    JobBuilder jobBuilder = JobBuilder.newJob(clazz).withIdentity(jobKey);
    if (!scheduler.checkExists(jobKey)) {
        // if the job doesn't already exist, we can create it, along with its trigger. this prevents us
        // from creating multiple instances of the same job when running in a clustered environment
        scheduler.scheduleJob(jobBuilder.build(), trigger);
        log.error("SCHEDULED JOB WITH KEY " + jobKey.toString());
    } else {
        // if the job has exactly one trigger, we can just reschedule it, which allows us to update the schedule for
        // that trigger.
        List<? extends Trigger> triggers = scheduler.getTriggersOfJob(jobKey);
        if (triggers.size() == 1) {
            scheduler.rescheduleJob(triggers.get(0).getKey(), trigger);
            return;
        }

        // if for some reason the job has multiple triggers, it's easiest to just delete and re-create the job,
        // since we want to enforce a one-to-one relationship between jobs and triggers
        scheduler.deleteJob(jobKey);
        scheduler.scheduleJob(jobBuilder.build(), trigger);
    }
}

This approach solves a number of problems:

  1. If the environment is not properly configured (i.e. jobs/triggers don't exist), then they will be created by the first instance that launches
  2. If the job already exists, but I want to modify its schedule (change a job that used to run every 7 minutes to now run every 5 minutes), I can define a new trigger for it, and a redeploy will reschedule the triggers in the database
  3. Exactly one instance of a job will be created, because we always refer to jobs by the specified JobKey, which is defined by the job itself. This means that jobs (and their associated triggers) are created exactly once, regardless of how many nodes are in the cluster, or how many times we deploy.

This is all well and good, but I'm concerned about a potential race condition when two instances are started at exactly the same time. Because there's no global lock around this code that all nodes in the cluster will respect, if two instances come online at the same time, I could end up with duplicate jobs or triggers, which kind of defeats the point of this code.

Is there a best practice for automatically defining Quartz jobs and triggers in a clustered environment? Or do I need to resort to setting my own lock?
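
If I did end up setting my own lock, I imagine it would look something like the sketch below, using a PostgreSQL advisory lock, since every node already shares that database. The lock key (42), the DataSource, and the surrounding method are placeholders, not code I actually have:

import java.sql.Connection;
import java.sql.Statement;
import javax.sql.DataSource;

private void initializeJobsUnderLock(DataSource dataSource) throws Exception {
    try (Connection conn = dataSource.getConnection();
         Statement stmt = conn.createStatement()) {
        // blocks until no other node holds the lock; the lock belongs to this
        // session, so it is released even if the node dies mid-initialization
        stmt.execute("SELECT pg_advisory_lock(42)");
        try {
            // call createOrUpdateJob(...) here for each job/trigger pair
        } finally {
            stmt.execute("SELECT pg_advisory_unlock(42)");
        }
    }
}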

MusikPolice

2 Answers


I am not sure if there is a better way to do this in Quartz, but if you are already using Redis or Memcached, I would recommend letting all instances perform an atomic increment against a well-known key. If the code you pasted is supposed to run only one job per cluster per hour, you could do the following:

long timestamp = System.currentTimeMillis() / 1000 / 60 / 60;
String key = String.format("%s_%d", jobId, timestamp);

// this will only be true for one instance in the cluster per (job, timestamp) tuple
boolean shouldExecute = redis.incr(key) == 1;

if (shouldExecute) {
  // run the mutually exclusive code
}

The timestamp gives you a moving window within which the instances compete to execute the job.
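
With Jedis, for example, the guard could be wrapped up like this. The Redis host, the key format and the two-hour expiry are just assumptions; the expiry only keeps old window keys from piling up:

import redis.clients.jedis.Jedis;

public class ClusterGuard {
  private final Jedis jedis = new Jedis("localhost", 6379);

  // returns true on exactly one instance per (jobId, hourly window)
  public boolean shouldExecute(String jobId) {
    long window = System.currentTimeMillis() / 1000 / 60 / 60;
    String key = String.format("%s_%d", jobId, window);
    long count = jedis.incr(key);   // atomic across the whole cluster
    jedis.expire(key, 2 * 60 * 60); // old window keys expire on their own
    return count == 1;              // only the first increment wins
  }
}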

Ingo
  • That's not exactly what I was looking for, in the sense that it doesn't prevent Quartz from creating multiple triggers for the same job, but it does ensure that only one of those triggers will execute within a given window, so I think it does solve the problem, if in a roundabout sort of way. – MusikPolice May 17 '16 at 20:24

I had (almost) the same problem: how to create triggers and jobs exactly once per software version in a clustered environment. I solved it by assigning one of the cluster nodes to be the lead node during start-up and letting it re-create the Quartz jobs. The lead node is the one that first successfully inserts the git revision number of the running software into the database. The other nodes use the Quartz configuration created by the lead node. Here's the complete solution: https://github.com/perttuta/quartz
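
In a nutshell, the election is just "first insert wins". A rough sketch (the table and column names here are only illustrative; the real code is in the repo):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// git_revision is the table's primary key, so only one insert per revision can succeed
public boolean tryToBecomeLeadNode(Connection connection, String gitRevision) throws SQLException {
    try (PreparedStatement ps = connection.prepareStatement(
            "INSERT INTO quartz_setup (git_revision) VALUES (?) ON CONFLICT DO NOTHING")) {
        ps.setString(1, gitRevision);
        // exactly one node gets an update count of 1; every other node hits the
        // conflict, inserts nothing, and should use the lead node's configuration
        return ps.executeUpdate() == 1;
    }
}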

Perttu T