  1. Summary: I am using spark-submit to submit my application to my Spark cluster, but the resources allocated to my application are not aligned with the parameters I specified.
  2. Details: I always get 4 containers, 16 cores, and 74752 MB RAM (roughly 73 GB). When I change the deploy mode from client to cluster, I get even less, e.g. 9 containers, 9 cores, and 45056 MB RAM (roughly 44 GB). I found this information on the cluster's :8088 page, where application info is shown, and cross-referenced it with the Executors tab at spark:4044, where executor info for the Spark application is shown.
  3. Below is the command I use (see the sketch after this list): spark-submit --master yarn --deploy-mode client --class "$1" target/scala-2.10/recommend-assembly-0.1.jar --executor-cores 8 --num-executor 15 --driver-memory 19g
  4. Environment info: Spark 1.6 on YARN, Hadoop 2.6. A cluster of 4 nodes (1 being the master), each with a 16-core CPU and 64 GB RAM (even though each node somehow only has access to 40 GB RAM).
  5. What I tried:
    1. I tried tinkering with the aforementioned parameters (e.g. num-executor), but I still get the same amount of resources. And when I change client to cluster mode, the resources allocated are even less.
    2. I suspect that some YARN setting is causing this. I found Apache Hadoop Yarn - Underutilization of cores, but it didn't help even after I changed the setting in capacity-scheduler.xml.
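
For reference, here is a sketch of the same command with the resource flags placed before the application jar; spark-submit passes anything that comes after the jar to the application's main class rather than consuming it itself, and the documented flag spelling is --num-executors. The class name, jar, and values are copied from the snippet above.

    # Sketch only: same values as above, but the flags come before the jar
    # so spark-submit itself picks them up (arguments after the jar go to the app).
    spark-submit \
      --master yarn \
      --deploy-mode client \
      --class "$1" \
      --executor-cores 8 \
      --num-executors 15 \
      --driver-memory 19g \
      target/scala-2.10/recommend-assembly-0.1.jar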
Sam Chan

1 Answer


I think you should learn more about how Spark runs on YARN, including containers, stages, the ApplicationMaster (AM), etc.
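
If it helps to relate the numbers on the :8088 page to individual containers (including the one the ApplicationMaster occupies), one possible starting point is the yarn CLI; the application ID below is just a placeholder.

    # Sketch: list running YARN applications, then print one application's report
    # (substitute the real application ID from the :8088 page).
    yarn application -list
    yarn application -status application_1550000000000_0001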

  • Actually, I have read and searched through a lot of material. Guess I am too dumb to figure out this problem. – Sam Chan Feb 12 '19 at 08:19