- Summary: I am using spark-submit to submit my application to my Spark cluster, but the resources allocated to my application are not aligned with the parameters I specified.
- Details: I am always getting 4 containers, 16 cores and 74752MB RAM (roughly 73GB). And when I change client mode to cluster, I only get less, e.g. 9 containers, 9 cores and 45056MB RAM (roughly 44GB). I found this information on the cluster:8088 page, where application info is shown. I also cross-referenced it with the Executors tab of spark:4044, where executor info is shown for the Spark application.
- Below is the code snippet I use:
spark-submit --master yarn --deploy-mode client --class "$1" target/scala-2.10/recommend-assembly-0.1.jar --executor-cores 8 --num-executor 15 --driver-memory 19g
- Environment Info: Spark 1.6 on YARN, Hadoop 2.6. A cluster of 4 nodes (1 being the master), each with a 16-core CPU and 64GB RAM (even though each of my nodes somehow only has access to 40GB RAM).
- What I tried:
- I tried tinkering with the aforementioned parameters (e.g. num-executors), but I am still getting the same amount of resources. And when I change client to cluster, the resources allocated are even less.
- I suspect that some YARN setting is causing this. I found Apache Hadoop Yarn - Underutilization of cores; however, it doesn't help even after I change the setting in capacity-scheduler.xml (see the sketch after this list).
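For completeness, a sketch of checking and applying that capacity-scheduler setting from the command line. It assumes the setting in question is the capacity scheduler's resource calculator (the one usually discussed when cores are under-counted) and that $HADOOP_CONF_DIR points at the active configuration directory; the property name, the DominantResourceCalculator class, and yarn rmadmin -refreshQueues are standard Hadoop/YARN pieces, everything else is an assumption about this cluster:

    # Check which resource calculator the capacity scheduler is using.
    # ($HADOOP_CONF_DIR and the grep context width are assumptions.)
    grep -A2 'yarn.scheduler.capacity.resource-calculator' \
        "$HADOOP_CONF_DIR/capacity-scheduler.xml"
    # For vcores to be counted as well as memory, the value should be:
    #   org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

    # After editing capacity-scheduler.xml, reload the queue configuration
    # (or restart the ResourceManager) so the change actually takes effect.
    yarn rmadmin -refreshQueues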
Asked by Sam Chan
1 Answer
I think you should learn more about how Spark runs on YARN, including containers, stages, the ApplicationMaster (AM), etc.
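For reference, spark-submit only parses options that appear before the application JAR; anything placed after the JAR path is forwarded to the application's main class as program arguments, and the flag name is --num-executors (plural). A minimal sketch with the flags from the question moved in front of the JAR:

    # Sketch only: options must precede the application JAR; anything after the
    # JAR path is passed to the application itself, not to spark-submit.
    spark-submit \
      --master yarn \
      --deploy-mode client \
      --class "$1" \
      --num-executors 15 \
      --executor-cores 8 \
      --driver-memory 19g \
      target/scala-2.10/recommend-assembly-0.1.jar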

ZhongJin Hu
Actually, I have read and searched through a lot of material. Guess I am too dumb to figure out this problem. – Sam Chan Feb 12 '19 at 08:19