
I have Apache Airflow running on an EC2 instance (Ubuntu). Everything is running fine. The database is SQLite and the executor is the SequentialExecutor (the defaults). But now I would like to run some DAGs that need to run at the same time, one every hour and one every 2 minutes. My question is: how can I upgrade my current setup to the CeleryExecutor and a Postgres database to get the advantage of parallel execution?

Will it work if I install and set up Postgres, RabbitMQ, and Celery, and make the necessary changes in the airflow.cfg configuration file?

Or do I need to reinstall everything from scratch (including Airflow)?

Please guide me on this.

Thanks

Aakash

1 Answer


You can, indeed, install Postgres/RabbitMQ/Celery, then update your configuration file (airflow.cfg), initialise the database, and restart the Airflow services.
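For reference, here's a minimal sketch of the relevant airflow.cfg changes, assuming Airflow 2.x, a local Postgres database named airflow (user/password airflow) and RabbitMQ with its default guest credentials; adjust hosts, names and credentials to your environment:

```ini
[core]
executor = CeleryExecutor

[database]
# on Airflow 1.x / pre-2.3 this key lives under [core] instead
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow

[celery]
broker_url = amqp://guest:guest@localhost:5672//
result_backend = db+postgresql://airflow:airflow@localhost:5432/airflow
```

After installing the extras and initialising the new database, something along these lines should bring the services back up (on 1.x the corresponding commands are `airflow initdb` and `airflow worker`):

```bash
pip install "apache-airflow[celery,postgres]"

airflow db init          # creates the Airflow schema in the new Postgres database
airflow webserver        # run each of these as its own process/service
airflow scheduler
airflow celery worker    # one or more workers execute tasks in parallel
```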

However, there is a side note: if required, you'd also have to migrate data from SQLite to Postgres. Most importantly, the database contains your connections and variables. It's possible to export variables beforehand and import them again using the Airflow CLI (see this answer, and the Airflow documentation).
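As a rough example, with the Airflow 2.x CLI (on 1.x the equivalent flags are `airflow variables -e` / `-i`):

```bash
# while still pointing at the SQLite database
airflow variables export /tmp/variables.json

# after switching airflow.cfg to Postgres and initialising it
airflow variables import /tmp/variables.json
```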

It's also possible to import your connections using the CLI, as described in this Airflow guide (or the documentation).
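The connection commands follow the same pattern; again a sketch assuming Airflow 2.x (older 1.x releases don't have these subcommands, there you'd re-add connections with `airflow connections --add` or via the UI):

```bash
# on the old (SQLite) setup
airflow connections export /tmp/connections.json

# on the new (Postgres) setup
airflow connections import /tmp/connections.json
```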

If you've just switched to the new database setup and you see something's missing, you can still easily switch back to the SQLite setup by reverting the changes in airflow.cfg.

bartcode