
I am using Spark on Hadoop and want to know how Spark allocates virtual memory to an executor.

As per the YARN vmem-pmem ratio (yarn.nodemanager.vmem-pmem-ratio, default 2.1), a container is allowed 2.1 times its physical memory as virtual memory.

Hence, if -Xmx is 1 GB, then 1 GB * 2.1 = 2.1 GB of virtual memory is allowed for the container.

How does it work for Spark? And is the statement below correct?

If I set executor memory = 1 GB, then:

Total virtual memory = 1 GB * 2.1 * spark.yarn.executor.memoryOverhead. Is this true?

If not, then how is virtual memory for an executor calculated in Spark?
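
To make the numbers concrete, here is a small sketch of the two pieces of arithmetic above (the overhead default of max(executorMemory * 0.10, 384 MB) is taken from the Spark on YARN documentation; how exactly the pieces combine is what I am asking):

    // Illustrative sketch only; how these values combine is the subject of the question.
    object ExecutorMemorySketch extends App {
      val executorMemoryMb = 1024            // spark.executor.memory = 1 GB
      val vmemPmemRatio    = 2.1             // yarn.nodemanager.vmem-pmem-ratio (default)

      // Default spark.yarn.executor.memoryOverhead per the Spark docs:
      // max(executorMemory * 0.10, 384) MB
      val memoryOverheadMb = math.max(executorMemoryMb * 0.10, 384).toLong

      // Plain YARN rule from the first part of the question:
      // virtual memory limit = physical allocation * vmem-pmem ratio
      val plainVmemMb = executorMemoryMb * vmemPmemRatio

      println(s"overhead = $memoryOverheadMb MB, 1 GB * 2.1 = $plainVmemMb MB of virtual memory")
    }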

Anuj

1 Answer


For Spark executor resources, yarn-client and yarn-cluster modes use the same configurations:

[Image from the linked article showing the executor resource configurations]

In spark-defaults.conf, spark.executor.memory is set to 2 GB.
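
For illustration, the corresponding entries in spark-defaults.conf might look like the following (the 2 GB value is the one mentioned above; the overhead line is only an example and uses the spark.yarn.executor.memoryOverhead property, whose value is given in MB):

    spark.executor.memory               2g
    spark.yarn.executor.memoryOverhead  1024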

I got this from: Resource Allocation Configuration for Spark on YARN

backtrack
  • As per http://spark.apache.org/docs/latest/running-on-yarn.html, the overhead is executorMemory * 0.10, with a minimum of 384 MB. I want to know which one is correct. But the main question I have is whether, in the case of Spark, YARN uses vmem-pmem (default 2.1) to calculate the virtual memory and then adds the overhead to that memory. In that case, for 1 GB of memory for a Spark container, does the total virtual memory become 1 * 2.1 * overhead? Is this statement correct? – Anuj Nov 01 '16 at 08:59
  • As per the problem in http://stackoverflow.com/questions/31646679/ever-increasing-physical-memory-for-a-spark-application-in-yarn, the configurations are spark.executor.memory = 32 GB and spark.yarn.executor.memoryOverhead = 6 GB. How come the virtual memory is calculated as 152 GB (as per the error message)? Is it (32 + 6) * 4 = 152, where 4 is yarn.nodemanager.vmem-pmem-ratio (see the sketch below)? – Anuj Nov 15 '16 at 05:12
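
A small sketch of the arithmetic asked about in the last comment (the 4.0 ratio and the 32 GB / 6 GB figures are the values quoted there, not verified settings):

    // Sketch of the (32 + 6) * 4 arithmetic from the comment above;
    // whether YARN actually computes the limit this way is the open question.
    object VmemFromComment extends App {
      val executorMemoryGb = 32.0   // spark.executor.memory
      val memoryOverheadGb = 6.0    // spark.yarn.executor.memoryOverhead
      val vmemPmemRatio    = 4.0    // yarn.nodemanager.vmem-pmem-ratio assumed in the comment

      // (executor memory + overhead) * vmem-pmem ratio
      val vmemLimitGb = (executorMemoryGb + memoryOverheadGb) * vmemPmemRatio
      println(s"(32 + 6) * 4 = $vmemLimitGb GB of virtual memory")   // prints 152.0
    }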