
The Spark programming guide (http://spark.apache.org/docs/latest/programming-guide.html) indicates that packages can be included when the shell is launched via:

$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.11:1.4.0

What is the syntax for including local packages (say, ones that have been downloaded manually)? Is it something to do with Maven coordinates?

mathtick

2 Answers


If the jars are already present on the master and the workers, you simply need to specify them on the classpath when launching spark-shell (or spark-submit):

spark-shell \
--conf "spark.driver.extraClassPath=/path/to/jar/spark-csv_2.11.jar" \
--conf "spark.executor.extraClassPath=/path/to/jar/spark-csv_2.11.jar"
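
Equivalently, the driver-side entry can be passed with the --driver-class-path shorthand. A minimal sketch, assuming the jar sits at the same path on every node (the path is a placeholder):

# the path below is a placeholder for wherever the jar actually lives on each node
spark-shell \
--driver-class-path "/path/to/jar/spark-csv_2.11.jar" \
--conf "spark.executor.extraClassPath=/path/to/jar/spark-csv_2.11.jar"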

If the jars are only present on the master and you want them to be shipped to the workers (this only works in client mode), you can add the --jars flag:

spark-shell \
--conf "spark.driver.extraClassPath=/path/to/jar/spark-csv_2.11.jar" \
--conf "spark.executor.extraClassPath=spark-csv_2.11.jar" \
--jars "/path/to/jar/jary.jar,/path/to/other/other.jar"
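
Note that --jars does not resolve transitive dependencies the way --packages does, so every dependency jar has to be listed explicitly. A minimal sketch, assuming spark-csv and its commons-csv dependency were both downloaded by hand into /tmp/jars (the paths, file names and versions are illustrative):

# /tmp/jars and the file names below are placeholders for the jars you downloaded
spark-shell \
--jars "/tmp/jars/spark-csv_2.11-1.4.0.jar,/tmp/jars/commons-csv-1.1.jar"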

For a more elaborate answer, see Add jars to a Spark Job - spark-submit.

Yuval Itzchakov

Please use:

./spark-shell --jars my_jars_to_be_included
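
A concrete sketch of that flag, assuming two manually downloaded jars in the current directory (the file names are placeholders); --jars takes a comma-separated list:

# file names below are placeholders for the jars you downloaded manually
./spark-shell --jars spark-csv_2.11-1.4.0.jar,commons-csv-1.1.jar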

There is an open question related to this; please check that question out.

dbustosp