How can Apache Spark or Hadoop Mapreduce request a fixed number of containers?
In Spark yarn-client mode, it can be requested by setting the configuration spark.executor.instances, which is directly related to the number of YARN containers it gets. How does Spark transform this into a Yarn parameter that is understood by Yarn?
I know by default, it can depend upon number of splits and configuration values yarn.scheduler.minimum-allocation-mb
, yarn.scheduler.minimum-allocation-vcores
. But Spark has ability to exactly request fixed number of containers. How can any AM do that?