
I am new to Spark. I am trying to compile and run a Spark application on my local machine that requires classes from an external jar file. If I open the jar (on ~/Desktop) I can see the missing class inside it, but when I run Spark I get

NoClassDefFoundError: edu/stanford/nlp/ie/AbstractSequenceClassifier

I add the jar to the Spark context like this:

String[] jars = {"/home/pathto/Desktop/stanford-corenlp-3.5.0.jar"};
SparkConf conf = new SparkConf().setAppName("Simple Application").setJars(jars);

Then I run a submit script like this:

/home/pathto/Downloads/spark-1.2.0-bin-hadoop2.4/bin/spark-submit \
  --class "SimpleApp" \
  --master local[4] \
  target/simple-project-1.0.jar \
  --jars local[4] /home/abe/Desktop/stanford-corenlp-3.5.0.jar

and hit the NoClassDefFoundError.

I understand this means that the worker threads can't find the class from the jar, but I am not sure what I am doing wrong. I have tried different syntaxes for the last line (below), but none of them works.

  --addJars local[4] /home/abe/Desktop/stanford-corenlp-3.5.0.jar
  --addJars local:/home/abe/Desktop/stanford-corenlp-3.5.0.jar

How can I fix this error?

bernie2436
  • Do you also get a `ClassNotFoundException`? http://stackoverflow.com/a/5756989/3318517 – Daniel Darabos Feb 10 '15 at 22:29
  • @DanielDarabos yes. I am getting that exception – bernie2436 Feb 11 '15 at 16:12
  • As a workaround I packaged the dependencies into the main app jar and deployed that with Maven. That got it working. But the question is still open. – bernie2436 Feb 11 '15 at 18:42
  • I am confused that you had any trouble finding jars in --master local[4] mode; I thought that bypassed all the jar issues. When I develop a program I have never gotten a "can't find jars" type of error until I run in standalone cluster mode. – JimLohse Dec 30 '15 at 19:47
  • Did you solve this problem? [I have the same right now](http://stackoverflow.com/questions/36400727/noclassdeffounderror-on-nodes-how-to-distribute-dependencies-to-all-nodes). – Stefan Falk Apr 04 '16 at 16:10

2 Answers


Try specifying the jar file location using file:/path/to/jar/jarfile.jar. Using local: means that the jar file must already exist at the specified location on each worker node. For more information, see the "Advanced Dependency Management" section of the Submitting Applications documentation.
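A minimal sketch of how the question's submit command might look with a file: URI (paths assumed from the question; note that options such as --jars must come before the application jar, since anything after the application jar is passed to the application as arguments):

/home/pathto/Downloads/spark-1.2.0-bin-hadoop2.4/bin/spark-submit \
  --class "SimpleApp" \
  --master local[4] \
  --jars file:/home/abe/Desktop/stanford-corenlp-3.5.0.jar \
  target/simple-project-1.0.jar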

Timothy Perrigo

You should use the fully qualified name of your main class, e.g. com.package.MyMainClass:

./bin/spark-submit --class com.MyMainClass /home/hadoop/Documents/Harish/HelloSpark-0.0.1-SNAPSHOT.jar -config /home/hadoop/Documents/Harish/Config.properties

This is what I used. Also check the file permissions on the Linux machine.

Harish Pathak