Understanding Spark Submit Yarn Client vs Cluster mode

Question

In my use case, when I submit the application with spark-submit --master yarn --deploy-mode client, the job runs fine. However, when I submit the same application with spark-submit --master yarn --deploy-mode cluster, the job fails to start.

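import org.apache.spark.{SparkConf, SparkContext}

// Master and deploy mode are not set here; they come from spark-submit
val conf = new SparkConf().setAppName("sample")
val sc = new SparkContext(conf)
val lines = sc.textFile("filepath") // "filepath" is a placeholder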

I understand that I should use SparkSession with Spark versions 2.0 and later. However, could that be the only difference causing the issue? I am running the code on EMR with the following configuration:

Master: 1 node, Core: 5 nodes

A few more questions. From reading some blogs and sites, it looks like the cluster is utilized irrespective of the spark-submit mode, and that the driver program is launched differently in the two submit modes. However, I don't understand the exact difference between the two approaches. Can someone help me understand what really slows down execution when "client" mode is used compared to "cluster"?
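To make the snippet above concrete, here is a minimal sketch of the same code written with SparkSession (Spark 2.x and later) that also logs the deploy mode the driver sees. The object name `Sample` is hypothetical, "filepath" is the same placeholder as in the original snippet, and the sketch assumes the Spark libraries are on the classpath. It is not a diagnosis of the cluster-mode failure.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical entry point; the name "Sample" is only for illustration.
object Sample {
  def main(args: Array[String]): Unit = {
    // Master and deploy mode are taken from the spark-submit arguments.
    val spark = SparkSession.builder().appName("sample").getOrCreate()
    val sc = spark.sparkContext

    // Reports "client" or "cluster". In cluster mode on YARN the driver
    // runs inside the cluster, so this line ends up in the YARN
    // application logs rather than on the submitting terminal.
    println(s"master=${sc.master}, deployMode=${sc.deployMode}")

    // "filepath" is a placeholder, exactly as in the question.
    // textFile is lazy, so nothing is read until an action is called.
    val lines = sc.textFile("filepath")

    spark.stop()
  }
}
```

Submitting the packaged jar once with --deploy-mode client and once with --deploy-mode cluster, then comparing where the log line shows up, shows where the driver ran in each case.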

@user9613318 Yes, I went through this solution. So can we conclude that it's just a trial-and-error approach here, rather than a proper path to follow? Also, can you help with the other question: why would my app fail in cluster mode while it runs successfully in client mode? Also, doesn't the YARN client's memory face an OOM issue while running in client mode? – Karthik k, May 14 '18 at 17:24
As far as understanding the difference goes, it's a dupe. Otherwise, if you are facing an issue that you need to solve, create an [MCVE](https://stackoverflow.com/help/mcve) with the error message so we can try to help or point to a solution. Your problem isn't salvageable otherwise. – eliasah, May 15 '18 at 09:28
Thanks @eliasah. I think I have an understanding of how the Spark program is distributed in cluster mode vs. client mode, where it runs outside of the cluster. The major issue that could surface is when the results are collected. – Karthik k, May 23 '18 at 11:24
That's why I added "Otherwise, if you are facing an issue that you need to solve, create an MCVE with the error message so we can try to help or point to a solution. Your problem isn't salvageable otherwise." – eliasah, May 23 '18 at 12:23
