Is there any way to specify a complete folder path for the jars to be pushed to both the driver and the executors, similar to --jars
in spark-submit
, which expects comma-separated jar names with full paths? It becomes tedious when there are many jars to push to both the driver and the executors.
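For illustration (the paths and main class below are made up), this is the difference between listing every jar by hand and the shell workaround of expanding a folder into a comma-separated list:

# listing every jar by hand quickly gets tedious
spark-submit --class com.example.Main \
  --jars /opt/app/libs/a.jar,/opt/app/libs/b.jar,/opt/app/libs/c.jar \
  app.jar

# shell workaround: expand a folder into a comma-separated list
spark-submit --class com.example.Main \
  --jars $(echo /opt/app/libs/*.jar | tr ' ' ',') \
  app.jar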

2 Answers
Question: Is there a way to push a complete jar folder to both the driver and the executors?
Yes, you can build an uber jar, which is a self-contained distribution with all dependencies packed inside. For example, if you are using Maven, you can use the maven-shade plugin or the assembly plugin for this. Below is a shade example.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.maventest</groupId>
    <artifactId>mytest</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>mytest</name>
    <url>http://maven.apache.org</url>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>commons-lang</groupId>
            <artifactId>commons-lang</artifactId>
            <version>2.3</version>
            <scope>compile</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <finalName>uber-${artifactId}-${version}</finalName>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
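With this configuration, running the regular package phase should produce the uber jar under target/ (given the finalName above, something like uber-mytest-1.0-SNAPSHOT.jar):

mvn clean package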
If you are using sbt, the sbt-assembly plugin serves the same purpose.
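A minimal sketch of the sbt route, assuming sbt-assembly (the plugin version below is just an example):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.2.0")

Running sbt assembly then produces the fat jar under target/scala-*/.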
Your spark-submit will then look something like this:
spark-submit --class [YOUR_MAIN_CLASS] [PATH_TO_YOUR_UBER_JAR]/[YOUR_UBER_JAR].jar
For further reading, see for example Google's article: Managing Java dependencies for Apache Spark applications.

When running Spark on YARN, you can set spark.yarn.archive
or spark.yarn.jars
in the spark-defaults.conf
configuration file.
spark.yarn.archive
distributes a single archive containing all the jars your executors need, while
spark.yarn.jars
lists the jars individually.
You can find more information in the official docs.
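A minimal sketch of what those entries could look like in spark-defaults.conf, assuming the jars have already been uploaded to a hypothetical HDFS location:

# spark-defaults.conf (the HDFS paths below are hypothetical)
# point YARN containers at every jar under a directory (globs are allowed)
spark.yarn.jars      hdfs:///user/spark/jars/*.jar

# or, alternatively, distribute one archive containing all the jars
spark.yarn.archive   hdfs:///user/spark/spark-libs.zip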
