4

I am having a problem running the same code in yarn-client mode and in yarn-cluster mode. My code executes perfectly when I run it in client mode, but fails when I run it in yarn-cluster mode.

It throws a file-not-found exception, stating that the pyspark.zip file could not be found. Any insight into this would be helpful.

Arnab

1 Answer

4

In yarn-cluster mode, the driver runs in the Application Master (inside a YARN container). In yarn-client mode, it runs in the client.

In yarn-cluster mode, the spark-shell is not supported.
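As a rough sketch of how the two modes are selected, the snippet below builds the `spark-submit` command line for each one using the Spark 1.x master strings `yarn-client` and `yarn-cluster`. The helper `build_submit_command` and the script name `my_job.py` are hypothetical names made up for illustration; a real job would likely need further options for its own setup.

```python
def build_submit_command(master, script="my_job.py"):
    """Return the spark-submit argument list for a YARN master string.

    `script` defaults to a hypothetical file name used only for illustration.
    """
    if master not in ("yarn-client", "yarn-cluster"):
        raise ValueError("unexpected master: %r" % (master,))
    # yarn-client: the driver runs in the process on the submitting machine.
    # yarn-cluster: the driver runs in the Application Master on the cluster.
    return ["spark-submit", "--master", master, script]


if __name__ == "__main__":
    for master in ("yarn-client", "yarn-cluster"):
        print(" ".join(build_submit_command(master)))
```

Because the driver in yarn-cluster mode runs inside a YARN container rather than on the submitting machine, files that exist only on the client are not automatically visible to it.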

Coming back to your problem: which version of Spark are you using? In versions below 1.4, running pyspark on YARN is limited to yarn-client mode (see SPARK-5162).

Henri Benoit
  • I am using Spark 1.4.1, so cluster mode should not be an issue. I am using spark-submit from my client machine to run a job, but as mentioned above, it runs fine in client mode and throws an exception in cluster mode. – Arnab Sep 18 '15 at 12:34