When executing the spark-submit command, does the path to the JAR need to point to an HDFS location?
Maybe you don't have rights to upload the package to HDFS but still want to run a Spark job.
It depends on the deploy mode of the driver instance.
For example, if you run spark-submit in client mode on a standalone cluster, you can specify a path on your local machine, since the Spark driver runs on the same machine where you execute the spark-submit command. The driver then distributes the JAR file to the workers.
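A minimal sketch of a client-mode submission, assuming a standalone master; the host name, class name, and paths are placeholders for illustration:

```sh
# Client mode: the driver starts on this machine, so a local JAR path works.
# The driver serves the JAR to the executors itself; no HDFS upload is needed.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode client \
  --class com.example.MyApp \
  /home/user/my-spark-job.jar
```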
However, if you run spark-submit in cluster mode, you need to upload the JAR to a path accessible from all the cluster nodes, such as HDFS, since in cluster mode the driver is instantiated on an arbitrary worker of the cluster.
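A corresponding cluster-mode sketch, again with placeholder host, class, and paths; note that the upload step is exactly what requires HDFS write rights:

```sh
# Upload the JAR so every node can reach it (requires HDFS write access).
hdfs dfs -put my-spark-job.jar /user/spark/jars/

# Cluster mode: the driver starts on an arbitrary worker, so the JAR path
# must resolve from any node in the cluster, e.g. an hdfs:// URI.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  hdfs:///user/spark/jars/my-spark-job.jar
```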