From my experience with Airflow in LocalExecutor, merely changing the
- DAG-definition file (possibly modifying the structure of the DAG)
- Code of Operators
- Inputs of DAG / Tasks (like Variables, Connections)
does NOT require restarting Airflow services (webserver and scheduler)
- It's only when you change the scheduling parameters of your DAG, namely start_date and schedule_interval, that renaming the dag_id is required
- I've read suggestions that if I don't want to rename my DAG, restarting Airflow services would also do the trick. But I've found that this is inconsistent (it does not always work)
Do note that the above facts are with reference to LocalExecutor, and they may not hold true for CeleryExecutor.
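
One way to avoid forgetting the rename is to derive the dag_id from the scheduling parameters themselves, so any change to start_date or schedule_interval automatically yields a new dag_id. This is only a sketch of that idea, not something Airflow provides: the helper name `versioned_dag_id` and the `my_etl` base name are made up for illustration, and the snippet deliberately does not import Airflow.

```python
import hashlib
from datetime import datetime


def versioned_dag_id(base_id: str, start_date: datetime, schedule_interval: str) -> str:
    """Return a dag_id that changes whenever the scheduling parameters change.

    Hypothetical helper (not part of Airflow): it appends a short hash of
    start_date and schedule_interval to the base name.
    """
    fingerprint = f"{start_date.isoformat()}|{schedule_interval}"
    digest = hashlib.sha1(fingerprint.encode("utf-8")).hexdigest()[:8]
    return f"{base_id}__{digest}"


# Example values; in a DAG file the result would be passed as the dag_id
# argument, alongside the same start_date and schedule_interval.
daily_id = versioned_dag_id("my_etl", datetime(2019, 1, 1), "@daily")
hourly_id = versioned_dag_id("my_etl", datetime(2019, 1, 1), "@hourly")
```

The trade-off is that each change of schedule shows up as a brand-new DAG in the UI, with the old run history left under the previous dag_id.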
------------------------------------------------------------------------------
However, as for the Airflow scheduler (and virtually any long-running process in general), it is recommended that it be restarted from time to time.
The scheduler should be restarted frequently
In our experience, a long
running scheduler process, at least with the CeleryExecutor, ends up
not scheduling some tasks. We still don’t know the exact cause,
unfortunately. Fortunately, airflow has a built-in workaround in the
form of the --num_runs flag. It specifies a number of iterations for
the scheduler to run of its loop before it quits. We’re running it
with 10 iterations, Airbnb runs it with 5. Note that this will cause
problems when using the LocalExecutor.
The above article is from 2015; I'm not sure if things have changed since.
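
The --num_runs workaround from the quote only helps if something starts the scheduler again after it exits. Below is a minimal sketch of such a relaunch loop, under some assumptions: the function name `run_with_restarts` and its parameters are mine, the flag spelling is taken from the quoted article (newer Airflow versions may spell it differently, so check `airflow scheduler --help`), and in practice a process supervisor such as systemd would usually do this job instead.

```python
import subprocess
import time
from typing import List, Optional

# Command as described in the quoted article; the flag spelling is an
# assumption for your Airflow version.
SCHEDULER_COMMAND = ["airflow", "scheduler", "--num_runs", "5"]


def run_with_restarts(
    command: List[str],
    max_launches: Optional[int] = None,
    pause_seconds: float = 1.0,
) -> int:
    """Launch `command` again each time it exits; return the launch count.

    With max_launches=None the loop never ends, which is the intended mode
    for a scheduler; a finite value is mainly useful for trying it out.
    """
    launches = 0
    while max_launches is None or launches < max_launches:
        # check=False: a non-zero exit should not stop the relaunch loop.
        subprocess.run(command, check=False)
        launches += 1
        time.sleep(pause_seconds)
    return launches


# Intended use (runs forever, so it is left as a comment here):
# run_with_restarts(SCHEDULER_COMMAND)
```

Keep in mind the warning in the quote that this approach causes problems with the LocalExecutor.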