I can't understand the difference between dag_concurrency and parallelism. The documentation and some of the related posts here seem to contradict my findings.

My understanding was that parallelism sets the maximum number of task instances that can run at once globally (across all DAGs) in Airflow, and that dag_concurrency sets the maximum number of task instances that can run at once for a single DAG.

So I set parallelism to 8 and dag_concurrency to 4 and ran a single DAG. It ran 8 task instances (TIs) at a time, but I expected it to run 4 at a time.

  1. How is that possible?

  2. Also, if it helps, I have set the pool size to 10 or so for these tasks. But that shouldn't matter, since the "config" parameters take priority over the pool's, right?

Nazim Kerimbekov
SpaceyBot

2 Answers

The other answer is only partially correct:

dag_concurrency does not explicitly control the number of tasks per worker. It is the maximum number of task instances that can run simultaneously within a single DAG, counted across that DAG's active runs. So if your DAG has a point where 10 tasks could run at once but you want to limit the load on the workers, set dag_concurrency lower.

The queue and pool settings also affect the number of tasks per worker.

These settings become very important as you start to build large libraries of simultaneously running DAGs.

parallelism is the maximum number of tasks across all the workers and DAGs.
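
One way to picture how these limits interact is as a set of caps that all apply at once, with the smallest one winning. The sketch below is only a simplified model under that assumption, not Airflow's scheduler code; the function name `max_running_for_dag` and its parameters are hypothetical and chosen for illustration.

```python
def max_running_for_dag(parallelism, dag_concurrency, pool_slots, runnable_tasks):
    """Upper bound on simultaneously running task instances for one DAG.

    Simplified model: every cap applies at once, so the tightest one wins.
    """
    return min(parallelism, dag_concurrency, pool_slots, runnable_tasks)


# Settings from the question: parallelism=8, dag_concurrency=4, pool of 10.
# runnable_tasks=10 is an assumed number of tasks that are ready to run.
limit = max_running_for_dag(parallelism=8, dag_concurrency=4,
                            pool_slots=10, runnable_tasks=10)
print(limit)  # 4
```

In this model neither the pool nor the config values take priority over the other; whichever limit is smallest is the one that applies.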

trejas

parallelism is better seen as max_active_tasks_total. You set it to 8, saying "I only want 8 tasks running at one time between all the workers".

dag_concurrency is better seen as max_active_tasks_for_worker. You set it to 4, saying "I only want each worker to run at most 4 task instances at a time".

So when you ran your DAG, it was running 8 task instances in total across two workers, with each worker running 4 tasks. I think you just misunderstood dag_concurrency.

This answer was partially taken from another SO answer.

Zack