
I'm trying to scale my Airflow jobs. We're deployed on k8s with a scheduler, webserver, Flower, Redis, and 4 worker nodes. I updated the ConfigMap to set

AIRFLOW__CORE__PARALLELISM=40

From what I've read (Airflow parallelism), this should mean roughly 40 tasks running at a time. But looking at some Datadog dashboards I set up, my number of running tasks is unchanged and still plateaus around 32.

Is there something I am missing here?

Here are a few other variables that may be related:

AIRFLOW__CELERY__WORKER_CONCURRENCY=8
AIRFLOW_VERSION=2.1.2
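
To sanity-check how I understand these knobs to combine, here's a rough sketch (the worker count of 4 is from our k8s deployment above; please correct me if the model is wrong):

# AIRFLOW__CORE__PARALLELISM: max running task instances across the whole install
parallelism = 40
# AIRFLOW__CELERY__WORKER_CONCURRENCY: tasks each Celery worker runs at once
worker_concurrency = 8
# worker pods in our k8s deployment
workers = 4

# Celery can only execute workers * worker_concurrency tasks at a time,
# so the effective ceiling is the smaller of the two limits.
print(min(parallelism, workers * worker_concurrency))  # -> 32

(32 happens to be exactly where my dashboards plateau.)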

Also found this pretty helpful: https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html

Robert Riley

1 Answer


I suspect your issue is Pools. If all your tasks are in the default_pool and it has only 32 slots, that will cap your total number of concurrent tasks. You may also want to check out Astronomer's Scaling Airflow guide.
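
As a sketch of the task-level side (the task and pool names here are illustrative): any task that doesn't set pool= lands in default_pool, and you can route expensive tasks to their own pool so they can't starve everything else:

from airflow.operators.python import PythonOperator

def do_work():
    ...

# Tasks default to default_pool; an explicit pool= caps just these tasks.
# "heavy_pool" must exist first (Admin -> Pools in the UI, or `airflow pools set`).
heavy_task = PythonOperator(
    task_id="heavy_task",
    python_callable=do_work,
    pool="heavy_pool",
)

The Admin -> Pools page in the UI shows used and queued slots per pool, which tells you which pool, if any, is the bottleneck.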

Collin McNulty
  • Thanks! So I checked via the UI: my slots are 150, running is 6, and queued is 5. I'd expect 0 to be queued since 6 < 150. – Robert Riley Apr 19 '23 at 21:27