
I have a Hadoop cluster with 5 nodes, each with 12 cores and 32 GB of memory. I use YARN as the MapReduce framework, so I have the following YARN settings:

  • yarn.nodemanager.resource.cpu-vcores=10
  • yarn.nodemanager.resource.memory-mb=26100

The cluster metrics shown on my YARN cluster page (http://myhost:8088/cluster/apps) then displayed VCores Total as 40, which looked fine.

Then I installed Spark on top of it and used spark-shell in yarn-client mode.

I ran one Spark job with the following configuration:

  • --driver-memory 20480m
  • --executor-memory 20000m
  • --num-executors 4
  • --executor-cores 10
  • --conf spark.yarn.am.cores=2
  • --conf spark.yarn.executor.memoryOverhead=5600
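For reference, the flags above correspond to an invocation along these lines (a sketch only; the master URL syntax and environment depend on your Spark version and setup):

```shell
# Sketch of the yarn-client invocation built from the flags above
# (assumes spark-shell is on PATH and HADOOP_CONF_DIR points at the cluster).
spark-shell \
  --master yarn-client \
  --driver-memory 20480m \
  --executor-memory 20000m \
  --num-executors 4 \
  --executor-cores 10 \
  --conf spark.yarn.am.cores=2 \
  --conf spark.yarn.executor.memoryOverhead=5600
```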

I set --executor-cores to 10 and --num-executors to 4, so logically there should be 40 VCores Used in total. However, when I checked the same YARN cluster page after the Spark job started running, it showed only 4 VCores Used (and 4 VCores Total).

I also found a parameter in capacity-scheduler.xml called yarn.scheduler.capacity.resource-calculator:

"The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default i.e. DefaultResourceCalculator only uses Memory while DominantResourceCalculator uses dominant-resource to compare multi-dimensional resources such as Memory, CPU etc."

I then changed that value to DominantResourceCalculator.

But when I restarted YARN and ran the same Spark application, I still got the same result: the cluster metrics still said VCores Used is 4! I also checked the CPU and memory usage on each node with htop, and none of the nodes had all 10 CPU cores fully occupied. What could be the reason?

I also tried running the same Spark job in a fine-grained way, i.e. with --num-executors 40 --executor-cores 1. When I checked the CPU status on each worker node again, all CPU cores were fully occupied.

Jacek Laskowski
Rui

3 Answers


I was wondering the same thing, but changing the resource-calculator worked for me.
This is how I set the property:

    <property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>

Check in the YARN UI how many containers and vcores are assigned to the application. With the change, the number of containers should be num-executors + 1, and the vcores should be (executor-cores * num-executors) + 1.
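As a sanity check for the job in the question, the expected numbers work out as follows (an illustrative calculation, assuming one extra container and one extra vcore for the yarn-client Application Master, per the formula above):

```shell
# Expected YARN allocation once DominantResourceCalculator is in effect
# (assumption: the Application Master takes 1 container and 1 vcore).
num_executors=4
executor_cores=10
containers=$((num_executors + 1))
vcores=$((executor_cores * num_executors + 1))
echo "containers=$containers vcores=$vcores"   # prints containers=5 vcores=41
```

Note that the question also sets spark.yarn.am.cores=2; in that case the vcore total would be 42 rather than 41.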


Without setting the YARN scheduler to FairScheduler, I saw the same thing. The Spark UI showed the right number of tasks, though, suggesting nothing was wrong. My cluster showed close to 100% CPU usage, which confirmed this.

After setting FairScheduler, the YARN Resources looked correct.

Def_Os
  • Please explain how you are doing that. What is the name of the config? – Wonay Sep 06 '18 at 23:59
  • @Wonay, I used this: https://www.cloudera.com/documentation/enterprise/5-8-x/topics/admin_fair_scheduler.html – Def_Os Sep 14 '18 at 23:08

Your executors take 10 cores each, plus 2 cores for the Application Master, so you are requesting 42 cores when you only have 40 vCores total.

Reduce --executor-cores to 8 and make sure to restart each NodeManager.
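The arithmetic behind that suggestion (an illustrative calculation, using the 2 AM cores configured in the question):

```shell
# Requested vcores before and after reducing executor cores
# (assumption: spark.yarn.am.cores=2, as in the question's flags).
total_vcores=40
before=$((10 * 4 + 2))   # 42: over capacity, so the request cannot be satisfied
after=$((8 * 4 + 2))     # 34: fits within the 40 available vcores
echo "before=$before after=$after total=$total_vcores"
```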

Also modify yarn-site.xml and set these properties:

  • yarn.scheduler.minimum-allocation-mb
  • yarn.scheduler.maximum-allocation-mb
  • yarn.scheduler.minimum-allocation-vcores
  • yarn.scheduler.maximum-allocation-vcores
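For example, in yarn-site.xml these might look like the following (illustrative values only; choose limits that match your cluster's per-node resources):

```xml
<property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>10</value>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>26100</value>
</property>
```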
Artur Sukhenko