I'm running Spark on EMR and submit my step with the following spark-submit command:
spark-submit --deploy-mode cluster --master yarn --num-executors 15 --executor-cores 3 --executor-memory 3G
Despite this, the Resource Manager UI shows that each of the 3 nodes has 4 or 6 YARN containers, each with 1 core and 3G of memory.
The nodes each have 16 cores and 30G of RAM.
It seems that YARN creates as many 1-core/3GB containers as it can, until it runs out of memory on the node. This leaves 10+ cores unused.
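Here's a rough back-of-the-envelope sketch (my own arithmetic, assuming the default Spark-on-YARN memory overhead of max(384 MB, 10% of executor memory)) showing that the request should fit within the cluster on both cores and memory:

    # Sanity check (my own sketch, not EMR output) that the request fits:
    # 3 nodes x 16 cores / 30 GB, asking for 15 executors x 3 cores x 3 GB
    # (+ assumed default memory overhead of max(384 MB, 10% of executor memory)).

    nodes, cores_per_node, mem_per_node_gb = 3, 16, 30

    num_executors, executor_cores, executor_mem_gb = 15, 3, 3
    overhead_gb = max(0.384, 0.10 * executor_mem_gb)      # assumed default overhead
    container_mem_gb = executor_mem_gb + overhead_gb

    requested_cores = num_executors * executor_cores      # 45
    requested_mem_gb = num_executors * container_mem_gb   # ~50.8

    print(f"cores:  need {requested_cores}, have {nodes * cores_per_node}")            # 45 vs 48
    print(f"memory: need {requested_mem_gb:.1f} GB, have {nodes * mem_per_node_gb} GB")  # ~51 vs 90

    # Yet the RM UI shows every container with only 1 vcore, so only 4-6 cores
    # per node are ever marked as in use.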
Why is Spark not honouring my --executor-cores setting?