18

I have a DAG that has been running every day at 3:00, and it ran fine for the past few weeks.

I've updated the schedule so that it now runs at 7:00, but apparently it hasn't run for the last 2 days. I can see the tasks for those two days with the status 'running' (in green), but no command is triggered.

Does one need to do something more to change the running time of a DAG?

I know that in the past one way to solve this was to clean out the tasks for this DAG in the meta-database and update the start_date, but I would rather avoid doing this again.

Does anyone have a suggestion?

David Batista

4 Answers

17

To schedule a DAG, Airflow just takes the last execution date and adds the schedule interval. If that time has passed, it runs the DAG. You cannot simply update the start date. A simple way around this is to edit your start date and schedule interval, rename your DAG (e.g. xxxx_v2.py), and redeploy it.
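The rule described above can be illustrated with a small sketch. This is a simplified model of the behaviour as this answer describes it, not Airflow's actual scheduler code; the function names `next_execution_date` and `is_run_due` are made up for illustration.

```python
from datetime import datetime, timedelta


def next_execution_date(last_execution_date, schedule_interval):
    """Simplified model: the next run is the last execution date plus the interval."""
    return last_execution_date + schedule_interval


def is_run_due(last_execution_date, schedule_interval, now):
    """A run is due once the computed next execution date has passed."""
    return now >= next_execution_date(last_execution_date, schedule_interval)


# History recorded while the DAG was still scheduled at 03:00.
last_run = datetime(2016, 11, 1, 3, 0)
daily = timedelta(days=1)

print(next_execution_date(last_run, daily))  # 2016-11-02 03:00:00
print(is_run_due(last_run, daily, now=datetime(2016, 11, 2, 7, 0)))  # True
```

In this model the next date is derived from the recorded history (03:00), not from the new start date, which is why editing the start date alone does not move the schedule and why a renamed DAG, which has no history, starts cleanly.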

Nevermore
p.magalhaes
  • 1
    Is there a larger discussion about the need to rename DAGs in order to reflect updated metadata? If so, can someone link to it please? – harveyxia Nov 08 '16 at 15:39
  • 2
    I couldn't find discussion but it is mentioned in pitfalls https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls – liferacer Jan 09 '17 at 03:24
  • 2
    The Common Pitfalls page moved: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62694614 – SergiyKolesnikov Sep 13 '21 at 07:31
  • A GitHub discussion about changing `schedule_interval` of an Airflow DAG can be found here: https://github.com/apache/airflow/discussions/25304. – Prabhatika Vij Aug 01 '23 at 08:11
5

An alternative solution to renaming the DAG is to edit the execution_date of all prior task instances and DAG runs of the DAG in the database. The tables to alter are task_instance and dag_run respectively.

One of the downsides of this approach is that you will lose the ability to browse logs of completed tasks through the webserver.
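As a rough illustration of what such an update could look like, here is a sketch against an in-memory SQLite database. The two-table schema below is a toy stand-in (the real Airflow tables have many more columns and usually live in another database engine), and the helper `shift_execution_dates`, the DAG id `my_dag` and the 4-hour shift are assumptions chosen to match the 3:00 to 7:00 change in the question.

```python
import sqlite3


def shift_execution_dates(conn, dag_id, modifier):
    """Shift execution_date for one DAG in both tables (toy schema, SQLite syntax)."""
    for table in ("task_instance", "dag_run"):
        conn.execute(
            f"UPDATE {table} "
            "SET execution_date = datetime(execution_date, ?) "
            "WHERE dag_id = ?",
            (modifier, dag_id),
        )
    conn.commit()


# Toy stand-in for the metadata database; not the real Airflow schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT)")
conn.execute(
    "CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, execution_date TEXT)"
)
conn.execute("INSERT INTO dag_run VALUES ('my_dag', '2016-11-01 03:00:00')")
conn.execute(
    "INSERT INTO task_instance VALUES ('my_dag', 'my_task', '2016-11-01 03:00:00')"
)

# Move the recorded history from 03:00 to 07:00.
shift_execution_dates(conn, "my_dag", "+4 hours")
print(conn.execute("SELECT execution_date FROM dag_run").fetchall())
```

On a real deployment the date arithmetic depends on the database engine, and it would be prudent to back up the metadata database and stop the scheduler before touching these tables.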

Conor
3

You can use the same DAG. After modifying `schedule_interval`, you need to mark the previous runs as succeeded via the `airflow backfill -m` command.

Chris Feng
2

David, you can also:
1. Delete the DAG via the experimental REST API.
2. Change the start_date to the desired value.
3. Add the same DAG back.
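Step 1 could be scripted roughly as follows. This sketch only builds the request and does not send it. It assumes the experimental API of Airflow 1.10 exposes `DELETE /api/experimental/dags/<DAG_ID>`, that the webserver listens on `http://localhost:8080`, and that the DAG is called `my_dag`; the helper `build_delete_dag_request` is hypothetical, so check the endpoint against the documentation for your Airflow version.

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_dag_request(base_url, dag_id):
    """Build (but do not send) a DELETE request for one DAG.

    Assumes the experimental endpoint path /api/experimental/dags/<DAG_ID>.
    """
    url = f"{base_url.rstrip('/')}/api/experimental/dags/{quote(dag_id, safe='')}"
    return Request(url, method="DELETE")


# Host, port and DAG id are placeholders for illustration.
req = build_delete_dag_request("http://localhost:8080", "my_dag")
print(req.get_method(), req.full_url)

# To actually send it (this deletes the DAG's records from the metadata DB):
# from urllib.request import urlopen
# urlopen(req)
```

Once the records are gone, changing the start_date and redeploying the DAG file lets the scheduler pick the DAG up again with no prior history.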