
One of the requirements in the workflow I am working on is to wait for an event to happen within a given time; if it does not, the task should be marked as failed, but the downstream task should still be executed.

I am wondering if "all_done" means all the dependency tasks are done, no matter whether they succeeded or not.

samarth

4 Answers


https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#concepts-trigger-rules

all_done means all operations have finished working. Maybe they succeeded, maybe not.

all_success means all operations have finished without error

So your guess is correct.
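The distinction between the two rules can be sketched as a pair of plain-Python predicates over the upstream task states (a toy illustration, not Airflow's actual implementation; the state strings mirror the ones quoted in the docs):

```python
# Toy predicates illustrating the difference between the two trigger rules.
FINISHED_STATES = {"success", "failed", "upstream_failed", "skipped"}

def all_done(states):
    # Every upstream task finished, regardless of outcome.
    return all(s in FINISHED_STATES for s in states)

def all_success(states):
    # Every upstream task finished without error.
    return all(s == "success" for s in states)

states = ["success", "failed"]
print(all_done(states))     # True  -> a task with trigger_rule="all_done" runs
print(all_success(states))  # False -> the default rule would not run the task
```

With one failed parent, only `all_done` lets the downstream task run, which is exactly the behaviour the question asks for.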

Ivo Merchiers
Sheena

SUMMARY
The tasks are "all done" if the count of SUCCESS, FAILED, UPSTREAM_FAILED, SKIPPED tasks is greater than or equal to the count of all upstream tasks.

I'm not sure why it would ever be greater; perhaps subdags do something weird to the counts.

Tasks are "all success" if the count of upstream tasks equals the count of successful upstream tasks.

DETAILS
The code for evaluating trigger rules is here https://github.com/apache/incubator-airflow/blob/master/airflow/ti_deps/deps/trigger_rule_dep.py#L72

  1. ALL_DONE

The following code runs the qry and returns the first row (the query is an aggregation that will only ever return one row anyway) into the following variables:

successes, skipped, failed, upstream_failed, done = qry.first()

the "done" column in the query corresponds to this: func.count(TI.task_id) in other words a count of all the tasks matching the filter. The filter specifies that it is counting only upstream tasks, from the current dag, from the current execution date and this:

TI.state.in_([
    State.SUCCESS, State.FAILED,
    State.UPSTREAM_FAILED, State.SKIPPED])

So done is a count of the upstream tasks with one of those 4 states.

Later there is this code

upstream = len(task.upstream_task_ids)
...
upstream_done = done >= upstream

And the actual trigger rule only fails on this

if not upstream_done

  2. ALL_SUCCESS

The code is fairly straightforward and the concept is intuitive:

num_failures = upstream - successes
if num_failures > 0:
    ...  # it fails
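The counting logic walked through above can be condensed into a short standalone sketch (a toy re-implementation for illustration, not the real trigger_rule_dep.py code; state names are lowercased strings):

```python
# Toy re-implementation of the counting logic described above.
DONE_STATES = {"success", "failed", "upstream_failed", "skipped"}

def rule_satisfied(rule, upstream_states):
    upstream = len(upstream_states)
    successes = sum(1 for s in upstream_states if s == "success")
    done = sum(1 for s in upstream_states if s in DONE_STATES)
    if rule == "all_done":
        # mirrors: upstream_done = done >= upstream
        return done >= upstream
    if rule == "all_success":
        # mirrors: num_failures = upstream - successes; fails if > 0
        return (upstream - successes) == 0
    raise ValueError(f"unknown rule: {rule}")

print(rule_satisfied("all_done", ["success", "failed"]))     # True
print(rule_satisfied("all_success", ["success", "failed"]))  # False
```

Note that a task still in "running" state counts toward neither `done` nor `successes`, so both rules keep waiting until every parent reaches a terminal state.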
JonathanDavidArndt
Davos

Consider using ShortCircuitOperator for the purpose you stated.
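ShortCircuitOperator wraps a Python callable; when the callable returns a falsy value, all downstream tasks are skipped rather than run. The skip semantics can be simulated in plain Python (a toy sketch, not Airflow code; task names are made up):

```python
# Toy simulation of ShortCircuitOperator semantics: a falsy condition
# causes every downstream task to be marked "skipped" instead of run.
def run_after_short_circuit(condition, downstream_tasks):
    if not condition():
        return {task: "skipped" for task in downstream_tasks}
    return {task: "success" for task in downstream_tasks}

# Event never arrived -> condition is False -> downstream is skipped.
print(run_after_short_circuit(lambda: False, ["load", "report"]))
# {'load': 'skipped', 'report': 'skipped'}
```

Note that skipping is different from the question's requirement (downstream should still run), so this fits only if skipping the downstream work is acceptable.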

Adam Bethke
Javed

All operators have a trigger_rule argument which defines the rule by which the generated task gets triggered.

I used these trigger rules in the following use cases:

all_success: (default) all parents have succeeded

all_done: all parents are done with their execution.

To carry out cleanups irrespective of whether the upstream tasks succeeded or failed, setting trigger_rule to ALL_DONE is useful.

one_success: fires as soon as at least one parent succeeds; it does not wait for all parents to be done.

To trigger an external DAG after the successful completion of a single upstream parent.

one_failed: fires as soon as at least one parent has failed; it does not wait for all parents to be done.

To trigger alerts as soon as at least one parent fails.
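The "one_*" rules above differ from the "all_*" rules in that they can fire before every parent has finished. A toy illustration (not Airflow's implementation; state strings are lowercased for the sketch):

```python
# Toy predicates for the two "one_*" trigger rules described above.
def one_success(states):
    # Fires as soon as any parent has succeeded.
    return any(s == "success" for s in states)

def one_failed(states):
    # Fires as soon as any parent has failed.
    return any(s == "failed" for s in states)

states = ["running", "failed"]
print(one_success(states))  # False -> nothing fires yet on one_success
print(one_failed(states))   # True  -> an alert task with one_failed fires now
```

Here one parent is still running, yet `one_failed` already fires, which is why it suits alerting use cases.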

Reference

AKs
    Updated link: https://airflow.apache.org/docs/apache-airflow/stable/concepts/dags.html#concepts-trigger-rules – karuhanga Sep 29 '21 at 12:21