
Say I want to increase the default YARN container size from 1024 MB to 1200 MB and make all YARN container memory allocations a multiple of 1200 MB (2400 MB, 3600 MB, and so on).

I can control the minimum and maximum container size with the YARN parameters yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb, as stated in Hadoop: The Definitive Guide.
I believed that the allocation increment was the same as yarn.scheduler.minimum-allocation-mb (see this answer), until I recently came across mentions of the yarn.scheduler.increment-allocation-mb parameter:

Request a container 1200MB/1vcore: minimum size is 1GB, increment is 500MB -> a container of 1.5GB (rounded up to the next increment, the minimum is used as a base)

I did not find any mention of this parameter, or its default value, in yarn-default.xml for Hadoop 3.1.1, let alone in older versions.

So my questions are: do I need to set yarn.scheduler.increment-allocation-mb explicitly to 1200 MB in yarn-site.xml, and what is the default value of this property?

Just to add more details, my Hadoop version is 2.6.0-cdh5.9.2 (Cloudera distribution).
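
To make the quoted rounding rule concrete, here is a rough sketch of how I understand the normalization arithmetic (this is just my illustration, not actual YARN code; the `normalize` helper and the maximum values are made up):

```python
import math

def normalize(requested_mb, minimum_mb, increment_mb, maximum_mb):
    """Round a request up to the next multiple of the increment,
    then clamp it between the scheduler minimum and maximum."""
    rounded = math.ceil(requested_mb / increment_mb) * increment_mb
    return min(maximum_mb, max(minimum_mb, rounded))

# The quoted example: request 1200 MB with a 1 GB minimum and a 500 MB increment
print(normalize(1200, 1024, 500, 8192))   # 1500 (~1.5 GB)

# What I am aiming for: every allocation a multiple of 1200 MB
print(normalize(1200, 1200, 1200, 9600))  # 1200
print(normalize(1300, 1200, 1200, 9600))  # 2400
```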

GoodDok

1 Answer


According to the Cloudera docs, the default is 512 MB.

Yes, you'll need to set yarn.scheduler.increment-allocation-mb to 1200 MB in order to have container sizes incremented in multiples of that value.
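
As a rough sketch, the relevant yarn-site.xml entries for your scenario could look like this (the 9600 MB maximum is just a placeholder, keep whatever limit fits your cluster; and as far as I know the increment setting is honored by the Fair Scheduler):

```xml
<!-- yarn-site.xml sketch: 1200 MB minimum with 1200 MB increments -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1200</value>
</property>
<property>
  <name>yarn.scheduler.increment-allocation-mb</name>
  <value>1200</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <!-- placeholder maximum; use your cluster's actual limit -->
  <value>9600</value>
</property>
```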

mazaneicha
  • @GoodDok Just out of curiosity - why do you need it? – mazaneicha Oct 23 '19 at 14:53
  • We've got a vmemory/vcores ratio of ~1200 MB in our cluster, so the container size was set to that value some time ago, and at some point `yarn.scheduler.maximum-allocation-mb` was set to the same value. Now I'm investigating how to get away from this configuration without breaking the existing logic. – GoodDok Oct 23 '19 at 15:27
  • I see. Just wanted to warn that if you're running a lot of Spark apps with default configs (https://spark.apache.org/docs/2.3.0/configuration.html), AMs will be requesting containers of 1.384 GB (`spark.executor.memory`=1g and `spark.executor.memoryOverhead`=384m by default), but YARN with your new setting will round the container size up to 2400 MB, so you will end up with fewer executor containers. – mazaneicha Oct 23 '19 at 15:58