How to design a distributed job scheduler?

Question

I want to design a job scheduler cluster, which contains several hosts to do cron job scheduling. For example, a job which needs run every 5 minutes is submitted to the cluster, the cluster should point out which host to fire next run, making sure:

Disaster tolerance: if not all of the hosts are down, the job should be fired successfully.

Validity: only one host to fire next job run.

Due to disaster tolerance, job cannot bind to a specific host. One way is all the hosts polling a DB table(certainly with lock), this guaranteed only one host gets the next job run. Since it often locks table, is there any better design?

score 6 · Answer 1 · answered Nov 12 '14 at 15:22

6

Use the Quartz framework for that. It has a cron like syntax, can be clustered and only one of the hosts in the cluster will do one job at a time. If a host or job fails, another host will retry the pending job.

answered Nov 12 '14 at 15:22

Stefan

12,108
5
47
66

1

Quartz is a pretty framework to do job scheduling, and it also supports distributed, while I want to know how Quartz's cluster is designed. – coderz Nov 12 '14 at 15:24
2

coderz, quartz with cluster works like a simple quartz configuration would. you just set the org.quartz.jobStore.isClustered = true, add the quartz tables to the database and quartz will take care of the disaster tolerance and the run-only-once. For more info on how quartz clustering works you can read http://www.quartz-scheduler.org/generated/2.2.1/html/qs-all/#page/Quartz_Scheduler_Documentation_Set%2Fre-cls_cluster_configuration.html%23 – Marios Nov 13 '14 at 08:15
Quartz(http://quartz-scheduler.org/) is unnecessarily complex for a scheduler in terms of setup and infra. Dkron(http://dkron.io/) doesnt do very well if the dkron server has a downtime. The best solution i have come across is temporal.io(https://temporal.io). It is a distributed cron scheduler and workflow orchestrator. Runs on docker as small as 500MB RAM t3a.nano via ECS with database supported by astra db (hosted cassandra). Temporal executes the schedules even when temporal itself has a downtime and picks up the crons it has missed during its downtime – thehellmaker Dec 03 '22 at 05:18

score 6 · Answer 2 · answered Nov 08 '17 at 13:31

6

I googled out the Dkron (Distributed job scheduling system). It has rest api and looks good. I plan try to use it Dkron site

answered Nov 08 '17 at 13:31

shcherbak

738
8
14

We built the one from ground up. You can read the writeup here https://medium.com/myntra-engineering/myntra-scheduler-service-a0153a04526c – Sanjeev Kumar Dangi May 27 '19 at 14:19

score 3 · Answer 3 · answered Sep 08 '18 at 21:09

3

I'm not sure how to design one, but there are open-source products that do that which can serve as an example. One is Quartz scheduler that is mentioned above.

But, apparently, WallmartLabs have evaluated Quartz, found it to be not good enough, and thus created and open-sourced a better (in their opinion) alternative to it called BigBen. Perhaps you could also look at that one.

answered Sep 08 '18 at 21:09

mvmn

3,717
27
30

2

Thanks for sharing the BigBen, nice article! – coderz Nov 09 '18 at 04:06

score 2 · Answer 4 · answered May 08 '15 at 18:32

2

Consider using AWS Simple Workflow Service if you are OK with using AWS web services. The benefit over something like Quartz is that it doesn't depend on database which you have to host and it can provide much more than scheduling. For example it can run some activities that fix your cluster or page you if scheduling is not possible for any reason. Here is an example of a cron workflow.

answered May 08 '15 at 18:32

Maxim Fateev

6,458
3
20
35

Yes, another good way! While I'm wondering how the framework is designed. – coderz May 09 '15 at 02:48
2

temporal.io is the open source solution based on AWS Simple Workflow Service ideas. – Maxim Fateev Jul 04 '21 at 22:06

Marco Haschka · Answer 5 · 2016-12-16T17:41:56.167

I did require something like this long ago, when synchronisation was done with floppy disks. You should be clear about three things, which seem to be simple, but in distributed environment the arent :-)

"Synchronisation Sections" If you get a net split, which means your cluster is split in two seperate sections wich can communicate inside the sections, but not between the two sections, the "fire the job exactly once" can only acquired per synchronisation section.

"Disaster" If almost all times all computers are up and running and only very seldom one fails, and the failure of two is almost unthinkable, its a completely different thing, than every host is running only part time, the connections are unstable, or the synchronisation is done by dial-up connections or by floppys. If you want even deal with a net split, it becomes really really complicated. If you want to deal with malicious hosts, you have another Problem.

"Validity" Fire every job exactly once... you have to synchronize faster than the job firing interval.

edit: Tipp for scheduler-tasks design. I have a big text file, wich contains lines. Every line is a job task, starting with job-type, then time to execute, then command and last but not least a optional resubmission-interval for repeating tasks. Syncing means merging. Executed tasks are deleted. If resubmission is on, then a new task is inserted or appended.

In an ideal world, every host ist allways connected to the others, I would implement something like a token ring. If there is no master, one is selected by the hosts, and the master is expected to schedule everything until he is not sending heardbeats for some time. If there are two masters, they negotiate for one of them to become master(maybe lower MAC-Adress... whatever).

If you have to deal with malicious hosts, you can use some byzantine gerenals-problem solution. The selection of the master is allready pretty good proofed against malicious hosts. With a little bit of rsa-krypto the selected master can signature every command, resend attacks can be treated with timestamps or growing indices... voila.

only as a story from an onld programmer, not intended for today everything is allways connected to the internet world: My big problem about 20 years ago was, that the hosts were synchronized from once a hour and once a day to once a week or once a month. So the solution was to have different commands: 1. execute on every host at a given date (wich is far enough in the future for synchronisation) 2. execute on a host, where "whoami" contains a certain substring. 3. execute on a random host with little probability, and send an acknowledgement to all others, that it is allready executed.

The third command-type does something like "fire only once", if the synchronisation is much faster than the probability of execution. It needs no master-slave architecture and it works pretty well, if you know the synchronisation intervalls.

score 1 · Answer 6 · answered Jul 27 '16 at 21:17

1

Check out Chronos (https://mesos.github.io/chronos/) which runs on top of Mesos - (https://mesos.apache.org/) resource scheduler.

answered Jul 27 '16 at 21:17

gnurik

71
2

1

Please substantiate your answer with more details, instead of just linking to an external site – adao7000 Jul 27 '16 at 21:33
1

Mesos is a system for managing compute resources, where a compute resource can be anything like a script, a web service, a hadoop/spark job, and it's language agnostic. It's aware of physical resources (CPU, Mem, etc) available in your cluster and can allocate jobs according to where resources are available. Chronos runs on top of Mesos and provides cron-like scheduling capability so that you can schedule recurring tasks and it's also language agnostic. I.e. Chronos schedules and submits your job to Mesos and Mesos figures out what host to run it on. – gnurik Jul 29 '16 at 00:19

How to design a distributed job scheduler?

6 Answers6

Linked