
I have an Apache Airflow setup using the CeleryExecutor, with one worker node on an EC2 instance. Code deployment works like this: the user triggers a deployment, which copies the code package to S3; an infra-related script then copies the package from S3 onto the instance; finally, the instance restarts ALL of the Airflow-related services.

I want to change this so that whenever I deploy modified code, any jobs that are currently running keep running, and I'd like to do it without moving to a cluster setup.

I'm thinking that restarting only the web server and scheduler, while leaving the worker alone, might solve this problem.
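
A minimal sketch of that idea, assuming the three services run as systemd units. The unit names (`airflow-webserver`, `airflow-scheduler`, `airflow-worker`) are hypothetical placeholders, not names taken from my setup, and whether running tasks actually survive depends on the executor configuration:

```python
import subprocess

# Hypothetical systemd unit names; substitute whatever the instance actually uses.
ALL_SERVICES = ["airflow-webserver", "airflow-scheduler", "airflow-worker"]


def restart_commands(services, keep_running=("airflow-worker",)):
    """Build one `systemctl restart` command per service, skipping those to keep running."""
    return [
        ["systemctl", "restart", name]
        for name in services
        if name not in keep_running
    ]


def restart(services=ALL_SERVICES, dry_run=True):
    """Restart the selected services; with dry_run=True only print the commands."""
    for cmd in restart_commands(services):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    restart(dry_run=True)
```

With `dry_run=True` the script only prints the commands it would run, so the deployment script can be checked before it touches any service.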

Varun Wadhwa

1 Answer


From my experience with Airflow on the LocalExecutor, merely changing the

  • DAG-definition file (possibly modifying the structure of the DAG)
  • code of Operators
  • inputs of the DAG / tasks (like Variables, Connections)

does NOT require restarting the Airflow services (webserver and scheduler).


  • It's only when you change the scheduling parameters of your DAG, namely start_date and schedule_interval, that renaming the dag_id is required.
  • I've read suggestions that if I don't want to rename my DAG, restarting the Airflow services would also do the trick. But I've found this claim to be unreliable (it does not always work).

Do note that the above is based on the LocalExecutor and may not hold true for the CeleryExecutor.

------------------------------------------------------------------------------

However, as for the Airflow scheduler (and virtually any long-running process in general), it is recommended to restart it from time to time.

The scheduler should be restarted frequently

In our experience, a long running scheduler process, at least with the CeleryExecutor, ends up not scheduling some tasks. We still don’t know the exact cause, unfortunately. Fortunately, airflow has a built-in workaround in the form of the --num_runs flag. It specifies a number of iterations for the scheduler to run of its loop before it quits. We’re running it with 10 iterations, Airbnb runs it with 5. Note that this will cause problems when using the LocalExecutor.

The above article is from 2015; I'm not sure whether things have changed since.
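
As a rough illustration of the workaround the article describes, the scheduler can be wrapped in a loop that starts it again each time it exits after its fixed number of runs. This is only a sketch: the function names, the pause between restarts, and the `max_restarts` and `runner` parameters are my own additions for illustration, and it assumes the `airflow` command is on the PATH. It uses the short `-n` form of the flag quoted above.

```python
import subprocess
import time


def scheduler_command(num_runs=10):
    """Command line for a scheduler that quits after `num_runs` loop iterations."""
    # `-n` is the short form of the num_runs flag on `airflow scheduler`.
    return ["airflow", "scheduler", "-n", str(num_runs)]


def run_forever(num_runs=10, pause_seconds=5, max_restarts=None,
                runner=subprocess.call):
    """Start the scheduler, wait for it to exit, then start it again.

    `max_restarts` and `runner` exist only so the loop can be bounded
    and exercised without a real Airflow installation.
    """
    restarts = 0
    while max_restarts is None or restarts < max_restarts:
        runner(scheduler_command(num_runs))  # blocks until the scheduler exits
        restarts += 1
        time.sleep(pause_seconds)
    return restarts


if __name__ == "__main__":
    run_forever()
```

In practice a process supervisor (systemd, supervisord, etc.) that restarts the scheduler on exit would do the same job as this loop.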

y2k-shubham