How YARN allocates containers for Spark in AWS

Question

I was able to create YARN containers for my Spark jobs. I have come across various blogs and YouTube videos on how to use `--executor-cores` efficiently (values of 4 to 6 for good throughput) and `--executor-memory`, after reserving 1 CPU core and 1 GB of RAM for the Hadoop daemons, and I determined the right values for each executor.
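The rule of thumb above can be sketched roughly as follows. This is only an illustration of the arithmetic, not an official formula: the function name `size_executors` and the example node size (16 vcores, 64 GB) are hypothetical, and the resulting memory budget still has to be split between `--executor-memory` and the memory overhead.

```python
def size_executors(node_vcores, node_mem_gb, cores_per_executor=5,
                   reserved_cores=1, reserved_mem_gb=1):
    """Return (executors_per_node, memory_budget_gb_per_executor).

    Hypothetical helper: reserves cores and memory for the Hadoop daemons,
    then divides what is left among executors of a fixed core count.
    """
    usable_cores = node_vcores - reserved_cores
    usable_mem_gb = node_mem_gb - reserved_mem_gb
    executors = usable_cores // cores_per_executor
    if executors <= 0:
        # The node is too small for even one executor of this size.
        return 0, 0.0
    return executors, usable_mem_gb / executors


# Hypothetical node with 16 vcores and 64 GB of RAM.
executors, mem_budget_gb = size_executors(node_vcores=16, node_mem_gb=64)
print(executors, mem_budget_gb)
```

The memory budget per executor covers the whole container, so the value passed to `--executor-memory` would need to be smaller than this budget to leave room for the overhead.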

I also came across articles like these.

I am checking from the Spark shell how many containers YARN creates, and I am not able to understand how the containers are allocated.

For example, I have created an EMR cluster with 1 master node of type m5.xlarge (4 vCores, 16 GiB RAM) and 1 core node of type c5.2xlarge (8 vCores, 16 GiB RAM).

I start the Spark shell with the following command:

`spark-shell --num-executors=6 --executor-cores=5 --conf spark.executor.memoryOverhead=1G --executor-memory 1G --driver-memory 1G`

I see that 6 executors, including the driver, are created with 5 cores per executor, for a total of 25 cores.

However, the metrics from the Hadoop history server do not reflect the right calculations.

I am very confused about how the Spark UI can show more cores allocated to the executors than are available. The cluster has only 8 vCores in total, counting the core node, yet a total of 25 cores are allocated to the executors.
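To make the mismatch concrete, here is a small sketch of the arithmetic using the numbers from the command and the cluster above. It assumes the full 16 GiB of the core node is available to YARN, which is an upper bound (the memory the NodeManager actually offers is normally lower), and the variable names are only illustrative.

```python
# Values taken from the spark-shell command above.
num_executors = 6
executor_cores = 5
executor_memory_mb = 1024
memory_overhead_mb = 1024

# Core node c5.2xlarge; assumes all 16 GiB is offered to YARN (upper bound).
node_vcores = 8
node_memory_mb = 16 * 1024

# Each container needs the executor memory plus the overhead.
container_mb = executor_memory_mb + memory_overhead_mb
requested_vcores = num_executors * executor_cores
requested_mb = num_executors * container_mb

# How many such containers fit if only one resource is considered.
fit_by_vcores = node_vcores // executor_cores
fit_by_memory = node_memory_mb // container_mb

print(f"requested: {requested_vcores} vcores, {requested_mb} MB")
print(f"containers that fit by vcores: {fit_by_vcores}")
print(f"containers that fit by memory: {fit_by_memory}")
```

Counted by vcores, only one such container fits on the core node, while counted by memory alone all six would fit. Which of the two limits applies depends on how the YARN scheduler is configured to compare resources.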

Can someone please explain what I am missing?

Strange. I guess next thing is to check driver log for messages like `yarn.YarnAllocator: Will request 6 executor container(s), each with X core(s) and XXXX MB memory...` and see what kind of containers YARN actually allocated. — mazaneicha, Oct 17 '22 at 21:44
I see the logs below: ```22/10/18 21:03:25 INFO Utils: Using initial executors = 6, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances 22/10/18 21:03:25 INFO YarnAllocator: Will request 6 executor container(s), each with 5 core(s) and 2048 MB memory (including 1024 MB of overhead) 22/10/18 21:03:25 INFO YarnAllocator: Submitted 6 unlocalized container requests.``` — PSD, Oct 18 '22 at 21:05
And then? Do you see `ExecutorAllocationManager: New executor N has registered (new total is N)` for N 1 thru 6? Before that, do you see your node manager in `YarnAllocator: Launching container container_e000_000000000000_000000_01_00000x on host your.host.your.domain for executor with ID N`? — mazaneicha, Oct 18 '22 at 22:06
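For checking the driver log as suggested in the comments, a minimal sketch like the one below could pull the requested container size out of the `YarnAllocator` line. The sample line is copied from the log quoted above; the regular expression and the function name `parse_request` are assumptions based on that one line, so other Spark versions may word the message differently.

```python
import re

# Pattern written against the single YarnAllocator line quoted in the comments.
REQUEST_RE = re.compile(
    r"Will request (\d+) executor container\(s\), each with (\d+) core\(s\) "
    r"and (\d+) MB memory \(including (\d+) MB of overhead\)"
)


def parse_request(line):
    """Return the requested container shape as a dict, or None if no match."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    containers, cores, memory_mb, overhead_mb = map(int, match.groups())
    return {
        "containers": containers,
        "cores_each": cores,
        "memory_mb_each": memory_mb,
        "overhead_mb_each": overhead_mb,
        # Memory left for the executor itself once the overhead is subtracted.
        "executor_memory_mb_each": memory_mb - overhead_mb,
    }


sample = (
    "22/10/18 21:03:25 INFO YarnAllocator: Will request 6 executor "
    "container(s), each with 5 core(s) and 2048 MB memory "
    "(including 1024 MB of overhead)"
)
request = parse_request(sample)
print(request)
```

This only shows what Spark asked YARN for. Whether all six containers were actually launched still has to be confirmed from the later `Launching container` and `New executor` messages mentioned in the comments.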
