
I have a local Spark 1.5.2 (Hadoop 2.4) installation on Windows, set up as explained here.

I'm trying to import a jar file that I created in Java using Maven (the jar is jmatrw, which I uploaded here on GitHub). Note that the jar does not contain a Spark program and has no dependencies on Spark. I tried the following steps, but none of them seems to work in my installation:

  • I copied the library to "E:/installprogram/spark-1.5.2-bin-hadoop2.4/lib/jmatrw-v0.1-beta.jar".
  • I edited spark-env.sh and added SPARK_CLASSPATH="E:/installprogram/spark-1.5.2-bin-hadoop2.4/lib/jmatrw-v0.1-beta.jar".
  • In a command window I ran > spark-shell --jars "E:/installprogram/spark-1.5.2-bin-hadoop2.4/lib/jmatrw-v0.1-beta.jar", but it says "Warning: skip remote jar".
  • In the Spark shell I tried scala> sc.addJar("E:/installprogram/spark-1.5.2-bin-hadoop2.4/lib/jmatrw-v0.1-beta.jar"), and it says "INFO: added jar ... with timestamp".

When I type scala> import it.prz.jmatrw.JMATData, spark-shell replies with "error: not found: value it".

I have spent a lot of time searching on Stack Overflow and on Google; indeed, a similar Stack Overflow question is here, but I'm still not able to import my custom jar.

Thanks


1 Answer


There are two settings in 1.5.2 to reference an external jar. You can add it for the driver or for the executor(s).

I'm doing this by adding the settings to spark-defaults.conf, but you can also set them when launching spark-shell or via SparkConf.

spark.driver.extraClassPath /path/to/jar/*
spark.executor.extraClassPath /path/to/jar/*
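
If you'd rather not edit spark-defaults.conf, the same two properties can be passed when you launch the shell. Here is a sketch using spark-shell's --conf flag with the jar path from the question:

spark-shell --conf "spark.driver.extraClassPath=E:/installprogram/spark-1.5.2-bin-hadoop2.4/lib/jmatrw-v0.1-beta.jar" --conf "spark.executor.extraClassPath=E:/installprogram/spark-1.5.2-bin-hadoop2.4/lib/jmatrw-v0.1-beta.jar"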

I don't see anything really wrong with the way you are doing it, but you could try the conf approach above, or set these using SparkConf:

import org.apache.spark.{SparkConf, SparkContext}

// Set the extra classpath before the SparkContext is created
val conf = new SparkConf()
conf.set("spark.driver.extraClassPath", "/path/to/jar/*")
val sc = new SparkContext(conf)
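
Once the jar is actually on the driver classpath (by either route), the import from your question should then resolve in the shell:

scala> import it.prz.jmatrw.JMATData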

In general, I haven't enjoyed working with Spark on Windows. Try to get onto Unix/Linux.

Kirk Broadhurst
  • I added the setting `spark.driver.extraClassPath` to the `spark-defaults.conf` file, and it works! Actually, SPARK_CLASSPATH was deprecated in Spark 1.0+, as the Spark log warnings also suggest. Thanks for your time spent reading and answering my question. – Donato Pirozzi Dec 30 '15 at 11:01