
I have a DAG that must run only one instance at a time. To enforce this I am using max_active_runs=1, which works fine:

from datetime import datetime, timedelta

from airflow import DAG

dag_args = {
    'owner': 'Owner',
    'depends_on_past': False,
    'start_date': datetime(2018, 1, 1, 12, 0),
    'email_on_failure': False
}

sched = timedelta(hours=1)
dag = DAG(job_id, default_args=dag_args, schedule_interval=sched, max_active_runs=1)

The problem is:

When the DAG is due to be triggered and an instance is still running, Airflow waits for that run to finish and then triggers the DAG again.

My question is:

Is there any way to skip the queued run, so that in this case the DAG will not run again right after the current execution finishes?

Thanks!

briba
  • You can place a task in your DAG that checks the dag_run table of the Airflow database for runs of the same dag_id in the 'running' state; if two instances of the same DAG are running, you can make the DAG fail. – Shahbaz Ali Nov 03 '19 at 17:39
  • @ShahbazAli that's a good idea! But how can I find the DAG ID, and how can I run that query? Thanks! – briba Nov 04 '19 at 13:20
  • And note that they are not running together, since I am setting max_active_runs to 1. Airflow is just holding the trigger until the previous run finishes. – briba Nov 04 '19 at 13:23
  • I can try something like this, but I'm not sure it's the best way to solve it: https://stackoverflow.com/questions/43732642/status-of-airflow-task-within-the-dag – briba Nov 04 '19 at 13:25
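The idea from the comments could be sketched as a "guard" task placed at the start of the DAG. This is only a sketch, assuming Airflow 1.10.x; the task id, the callable name, and the use of `DagRun.find` with a `guard_task` first step are illustrative, not something from the question itself:

```python
# Sketch of the comment thread's idea (assumes Airflow 1.10.x): a first
# task that skips the rest of the run if another run of the same DAG is
# already in the RUNNING state. Names like `guard_task` are illustrative.
from airflow.exceptions import AirflowSkipException
from airflow.models import DagRun
from airflow.operators.python_operator import PythonOperator
from airflow.utils.state import State


def skip_if_already_running(dag_run, **kwargs):
    # Query the dag_run table for runs of this DAG in the RUNNING state.
    running = DagRun.find(dag_id=dag_run.dag_id, state=State.RUNNING)
    # The current run is itself RUNNING, so more than one means overlap.
    if len(running) > 1:
        raise AirflowSkipException('Another run is still active; skipping.')


guard_task = PythonOperator(
    task_id='skip_if_already_running',
    python_callable=skip_if_already_running,
    provide_context=True,  # passes dag_run into the callable in 1.10.x
    dag=dag,
)
# Downstream tasks set after guard_task will be skipped along with it
# (with the default 'all_success' trigger rule).
```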

1 Answer


This is just from checking the docs, but it looks like you only need to add another parameter:

catchup=False

catchup (bool) – Perform scheduler catchup (or only run latest)? Defaults to True
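Applied to the question's setup, the suggestion would look like the sketch below. One caveat worth hedging: `catchup=False` only stops the scheduler from backfilling missed schedule intervals since `start_date`; whether it also covers the "run queued behind a long-running instance" case in the question depends on the scheduler behavior of the Airflow version in use:

```python
# The answer's suggestion applied to the question's DAG (a sketch).
# Note: catchup=False prevents backfilling of missed intervals; it is
# not guaranteed to skip a run queued behind a still-active one.
from datetime import datetime, timedelta

from airflow import DAG

dag_args = {
    'owner': 'Owner',
    'depends_on_past': False,
    'start_date': datetime(2018, 1, 1, 12, 0),
    'email_on_failure': False
}

dag = DAG(
    job_id,  # as in the question; assumed to be defined elsewhere
    default_args=dag_args,
    schedule_interval=timedelta(hours=1),
    max_active_runs=1,
    catchup=False,
)
```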

PuppyKhan