I need some clarification on the relationship between Airflow's core.parallelism setting, the executor's open slots, and their impact on task duration.
Based on my experiment, the number of executor open slots tracks the core.parallelism value in the Google Cloud Composer Airflow configuration. For example, on a 3-node Composer environment, the default parallelism and the number of open slots are both 30. With this configuration, I added a single DAG (the only DAG in the environment) containing one task. Averaging 10 runs, the task duration came to about 15 seconds.
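For context, this is roughly how I changed the setting between runs. This is a sketch assuming a hypothetical environment named `example-env` in `us-central1`; substitute your own environment name and location:

```shell
# Override the Airflow core.parallelism setting on a Composer environment.
# Composer expects overrides in "section-key=value" form.
gcloud composer environments update example-env \
    --location us-central1 \
    --update-airflow-configs=core-parallelism=10
```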
Next I reduced parallelism to 10, which reduced the open slots to 10 as well, and executed the same DAG with the same task. The average duration remained the same, i.e. 15 seconds.
My question: the default is 30. Suppose my application needs to execute only the DAG above, with no other DAGs present in the environment, and I want to take full advantage of the Composer resources. My expectation was that reducing parallelism would leave fewer slots of higher capacity, since the environment's CPU and RAM would be shared among fewer slots. But the experiment shows the average task duration stays the same.
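My mental model of the experiment can be sketched with a plain thread pool, where the pool size plays the role of parallelism/open slots (this is only an analogy under that assumption, not Airflow itself):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task():
    # Stand-in for the DAG's single task: a fixed-cost unit of work.
    time.sleep(0.2)

def run_single_task(pool_size):
    # pool_size caps how many tasks may run concurrently; a lone task
    # still occupies exactly one worker and takes its own fixed time.
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        pool.submit(task).result()
    return time.monotonic() - start

wide = run_single_task(30)    # "parallelism = 30"
narrow = run_single_task(10)  # "parallelism = 10"
# Both durations come out essentially equal: shrinking the pool removes
# idle slots, it does not make the one occupied slot any faster.
print(f"{wide:.2f}s vs {narrow:.2f}s")
```

If this analogy holds for Airflow, it would explain the unchanged 15-second average, but I'd like confirmation.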
Could someone please clarify whether reducing parallelism can make a single task run faster?