I have a tiny cluster composed of 1 master (namenode, secondarynamenode, resourcemanager) and 2 slaves (datanode, nodemanager).
I have set the following in the yarn-site.xml of the master:
yarn.scheduler.minimum-allocation-mb: 512
yarn.scheduler.maximum-allocation-mb: 1024
yarn.scheduler.minimum-allocation-vcores: 1
yarn.scheduler.maximum-allocation-vcores: 2
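For reference, these are declared with the standard Hadoop <property> blocks:

    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>512</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>1024</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-vcores</name>
      <value>1</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>2</value>
    </property>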
I have set the following in the yarn-site.xml of the slaves:
yarn.nodemanager.resource.memory-mb: 2048
yarn.nodemanager.resource.cpu-vcores: 4
Then, on the master, I have set the following in mapred-site.xml:
mapreduce.map.memory.mb: 512
mapreduce.map.java.opts: -Xmx500m
mapreduce.map.cpu.vcores: 1
mapreduce.reduce.memory.mb: 512
mapreduce.reduce.java.opts: -Xmx500m
mapreduce.reduce.cpu.vcores: 1
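In XML form (map side shown; the reduce properties are declared identically):

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>512</value>
    </property>
    <property>
      <!-- JVM heap deliberately kept slightly below the 512 MB container size -->
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx500m</value>
    </property>
    <property>
      <name>mapreduce.map.cpu.vcores</name>
      <value>1</value>
    </property>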
So my understanding is that, when running a job, the MapReduce ApplicationMaster will try to create as many 512 MB / 1 vCore containers as possible on both slaves. Since each slave has only 2048 MB and 4 vCores available, that gives room for 4 containers per slave. This is precisely what happens with my jobs, so no problem so far.
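Spelling out that arithmetic per slave:

    memory bound: floor(2048 MB / 512 MB per container)  = 4 containers
    vcore bound:  floor(4 vcores / 1 vcore per container) = 4 containers

Both limits agree on 4 containers, which matches what I observe.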
However, when I increase mapreduce.map.cpu.vcores and mapreduce.reduce.cpu.vcores from 1 to 2, there should in theory only be enough vCores to create 2 containers per slave, right? But no, I still get 4 containers per slave.
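Again per slave, what I expected:

    memory bound: floor(2048 MB / 512 MB)   = 4 containers
    vcore bound:  floor(4 vcores / 2 vcores) = 2 containers

The vCore bound should be the binding one, but it clearly isn't.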
I then tried increasing mapreduce.map.memory.mb and mapreduce.reduce.memory.mb from 512 to 768. This leaves room for only 2 containers per slave (floor(2048 / 768) = 2).
Whether the vCores are set to 1 or 2 for mappers and reducers makes no difference: I always get 2 containers per slave with 768 MB and 4 containers with 512 MB. So what are vCores for? The ApplicationMaster doesn't seem to care.
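My unverified guess is that the scheduler is only counting memory. If I read the CapacityScheduler documentation correctly, its default DefaultResourceCalculator considers memory alone, and vCores would only be taken into account after switching to the DominantResourceCalculator in capacity-scheduler.xml, something like:

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>

I haven't tested this yet, so I'd still like confirmation that this is really what is going on.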
Also, when setting the memory to 768 and the vCores to 2, the NodeManager UI displays this info for a mapper container:
The 768 MB has turned into a TotalMemoryNeeded of 1024, and the 2 vCores are ignored and displayed as a TotalVCoresNeeded of 1.
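If I had to guess, the memory request is being normalized up to the next multiple of yarn.scheduler.minimum-allocation-mb (512 here):

    ceil(768 / 512) * 512 = 2 * 512 = 1024 MB

That would explain TotalMemoryNeeded = 1024, but not why TotalVCoresNeeded stays at 1.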
So, to break the "how does it work" question down into several questions:
- Is only memory used (and vCores ignored) to calculate the number of containers?
- Is the mapreduce.map.memory.mb value only a completely abstract value used to calculate the number of containers (and is that why it can be rounded up to the next power of 2)? Or does it represent a real memory allocation in some way?
- Why do we specify an -Xmx value in mapreduce.map.java.opts? Why doesn't YARN use the value from mapreduce.map.memory.mb to allocate memory to the container?
- What is TotalVCoresNeeded, and why is it always equal to 1? I tried changing mapreduce.map.cpu.vcores on all nodes (master and slaves), but it never changes.