19

How can I configure an Airflow DAG to run at a specified time every day, no matter what happens, exactly like a cron job?

I know that similar behaviour can be obtained with a TimeSensor, but then the timing depends on the sensor task, which might conflict with the DAG's execution time.

Example: with the sensor approach, if I have a sensor set for 00:15 but the DAG itself starts later, my task is delayed. So even with the sensor approach I need to make sure that the DAG starts at the right time.

So how do I make sure that the DAG is executed at the specified time?

samarth

3 Answers

20

To start a DAG every day at 2:30 AM, for example, you can do the following:

from datetime import datetime

from airflow import DAG

dag = DAG(
    dag_id='dag_id',
    # start date: 28-03-2017
    start_date=datetime(year=2017, month=3, day=28),
    # cron expression: run this DAG every day at 02:30
    schedule_interval='30 2 * * *')

Before configuring the schedule, the interpretation of the cron expression can be verified and tested here: https://crontab.guru/
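For a quick offline check before pasting an expression into a DAG, a rough range check can be sketched in plain Python. This is only an illustration, not how Airflow or croniter validate expressions: `check_cron` and `FIELD_RANGES` are hypothetical names, and the sketch handles only numbers, `*`, ranges, steps, and comma lists (no month or weekday names).

```python
# Allowed numeric range for each of the five standard cron fields.
FIELD_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),  # 0 and 7 both commonly mean Sunday
]


def check_cron(expression):
    """Return a list of problems found in a five-field cron expression."""
    fields = expression.split()
    if len(fields) != 5:
        return ["expected 5 fields, got %d" % len(fields)]
    problems = []
    for text, (name, lo, hi) in zip(fields, FIELD_RANGES):
        for part in text.split(","):
            base = part.split("/")[0]  # drop an optional step such as /2
            if base == "*":
                continue
            for number in base.split("-"):  # a single value or a lo-hi range
                if not number.isdigit() or not lo <= int(number) <= hi:
                    problems.append(
                        "%s: %r is not a number in %d-%d" % (name, number, lo, hi)
                    )
    return problems


print(check_cron("30 2 * * *"))  # daily at 02:30, no problems expected
print(check_cron("0 24 * * *"))  # hour 24 is out of range
```

An empty list means every field is within range; it does not prove the schedule means what you intended, so checking the expression on crontab.guru is still worthwhile.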

Javed
  • airflow 1.8.0, `schedule_interval='0 24 * * *'` doesn't work for me, got a nuclear bomb blast ascii art and the following error message: `CroniterBadCronError: [0 24 * * * ] is not acceptable, out of range` – ruhong Aug 02 '17 at 06:30
  • Reference for the cron job format https://stackoverflow.com/questions/14710257/running-a-cron-job-at-230-am-everyday – Javed Aug 02 '17 at 06:35
  • @ruhong that is exactly why he suggested to check the cron on the link – cryanbhu Jul 31 '18 at 09:09
  • How do we set up DAG to run every other day at 11 PM? Not daily. – jscriptor Jul 16 '19 at 19:56
  • https://crontab.guru/#0_23_1-31/2_*_* , https://unix.stackexchange.com/a/16094/345195 – Javed Jul 23 '19 at 07:58
0

@jscriptor I see in a comment you are wondering how to run every other day. The day of the month is the third field (the order is minute, hour, day of month, month, day of week), so 30 2 */2 * * will run every other day at 2:30 AM. The result is a little odd around month boundaries: */2 matches days 1, 3, 5, ... of each month, so after a 31-day month it runs on the 31st and then again on the 1st. You can force it to run on only odd or only even days by specifying the range:

# Will only run on odd days:
30 2 1-31/2 * * command

# Will only run on even days:
30 2 2-30/2 * * command
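
To see which calendar days a day-of-month field actually selects, here is a small sketch in plain Python. It is an illustration under simple assumptions, not the parser that cron or Airflow uses: `matching_days` is a hypothetical helper that understands only `*`, a single day, or a range, each with an optional `/step`.

```python
import calendar


def matching_days(field, year, month):
    """Days of the given month matched by a day-of-month field such as '*/2'."""
    days_in_month = calendar.monthrange(year, month)[1]
    base, _, step = field.partition("/")
    step = int(step) if step else 1
    if base == "*":
        first, last = 1, 31
    elif "-" in base:
        first, last = (int(x) for x in base.split("-"))
    else:
        first = last = int(base)
    # Days beyond the end of the month simply never occur.
    return [d for d in range(first, last + 1, step) if d <= days_in_month]


# January has 31 days, so '*/2' fires on 31 January and again on 1 February.
print(matching_days("*/2", 2019, 1)[-1], matching_days("*/2", 2019, 2)[0])  # 31 1

# The even-day range skips from 30 January to 2 February instead.
print(matching_days("2-30/2", 2019, 1)[-1], matching_days("2-30/2", 2019, 2)[0])  # 30 2
```

Neither form gives a strict 48-hour cadence across month boundaries; the cron day-of-month field restarts with every month.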
Carlos Mostek
-1

You can set the schedule_interval to a string cron expression when you instantiate a DAG:

schedule_interval='0 * * * *'

BaseOperator documentation

obi
  • Yeah, I figured it out, but the problem is that it's not working as expected. If I set the cron string to "1,2,3 * * * *", it executes minute 1 at minute 2, minute 2 at minute 3, and minute 3 at minute 1 of the next hour. – samarth Mar 14 '16 at 10:48
  • [More details of the issue with cron expression here](http://stackoverflow.com/questions/35985813/airflow-cron-expression-is-not-scheduling-dag-properly) – samarth Mar 14 '16 at 11:07
  • Yeah, that's because Airflow doesn't execute at the start of the interval; it runs the DAG after that interval has finished. Since you specify minutes 1, 2, and 3, it will run at the end of those periods, at minutes 2, 3, and 1. That's explained in the docs [here](https://airflow.incubator.apache.org/scheduler.html?highlight=schedule_interval#scheduling-triggers) – Mercury Sep 08 '16 at 22:23
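
The behaviour described in the comment above can be modelled in a few lines of plain Python. This is a simplified sketch of the "run after the interval has finished" rule, not Airflow's scheduler code: `schedule_points` and `runs` are hypothetical helpers, and only an hourly cron with a list of minutes is modelled.

```python
from datetime import datetime, timedelta


def schedule_points(minutes, start, count):
    """First `count` times at or after `start` whose minute is in `minutes`."""
    points = []
    t = start.replace(second=0, microsecond=0)
    while len(points) < count:
        if t.minute in minutes and t >= start:
            points.append(t)
        t += timedelta(minutes=1)
    return points


def runs(minutes, start, count):
    """Pair each schedule point with the moment its run can actually start.

    A run stamped with one schedule point is triggered only once the
    following schedule point is reached, i.e. when its interval has ended.
    """
    points = schedule_points(minutes, start, count + 1)
    return list(zip(points, points[1:]))


# Model of the cron string "1,2,3 * * * *", starting at 10:00.
for stamped, triggered in runs({1, 2, 3}, datetime(2016, 3, 14, 10, 0), 3):
    print("run stamped", stamped.time(), "starts at", triggered.time())
```

The three pairs come out as 10:01 at 10:02, 10:02 at 10:03, and 10:03 at 11:01, which matches what samarth observed. The same rule applied to a daily `30 2 * * *` schedule means the run stamped 28 March 02:30 starts on 29 March at 02:30: the DAG still fires at 02:30 every day, it just carries the previous interval's timestamp.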