When executing the spark-submit command, does the path to the JAR need to point to an HDFS location?
Maybe you don't have rights to upload the package to HDFS but still want to run a Spark job.
It depends on the deploy mode of the driver instance.
For example, if you run spark-submit in client mode on a standalone cluster, you can specify a path on your local machine, since the Spark driver runs on the same machine where you execute the spark-submit command. The driver then distributes the JAR file to the workers.
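A minimal sketch of a client-mode submission, assuming a standalone master; the host name, class name, and paths are placeholders for illustration:

```sh
# Client mode: the driver starts on this machine, so a local JAR path works.
# The driver serves the JAR to the executors itself; no HDFS upload is needed.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode client \
  --class com.example.MyApp \
  /home/user/my-spark-job.jar
```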
However, if you run spark-submit in cluster mode, you need to upload the JAR to a path accessible from all the cluster nodes, such as HDFS, since in cluster mode the driver is instantiated on an arbitrary worker of the cluster.
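A corresponding cluster-mode sketch, again with placeholder host, class, and paths; note that the upload step is exactly what requires HDFS write rights:

```sh
# Upload the JAR so every node can reach it (requires HDFS write access).
hdfs dfs -put my-spark-job.jar /user/spark/jars/

# Cluster mode: the driver starts on an arbitrary worker, so the JAR path
# must resolve from any node in the cluster, e.g. an hdfs:// URI.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  hdfs:///user/spark/jars/my-spark-job.jar
```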