
I'm running Spark 2.1.0, Hive 2.1.1 and Hadoop 2.7.3 on Ubuntu 16.04.

I downloaded the Spark source from GitHub and built the "without Hadoop" version:

./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"

When I run ./sbin/start-master.sh, I get the following exception:

 Spark Command: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp /home/server/spark/conf/:/home/server/spark/jars/*:/home/server/hadoop/etc/hadoop/:/home/server/hadoop/share/hadoop/common/lib/:/home/server/hadoop/share/hadoop/common/:/home/server/hadoop/share/hadoop/mapreduce/:/home/server/hadoop/share/hadoop/mapreduce/lib/:/home/server/hadoop/share/hadoop/yarn/:/home/server/hadoop/share/hadoop/yarn/lib/ -Xmx1g org.apache.spark.deploy.master.Master --host ThinkPad-W550s-Lab --port 7077 --webui-port 8080
 ========================================
 Error: A JNI error has occurred, please check your installation and try again
 Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
     at java.lang.Class.getDeclaredMethods0(Native Method)
     at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
     at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
     at java.lang.Class.getMethod0(Class.java:3018)
     at java.lang.Class.getMethod(Class.java:1784)
     at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
     at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
 Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger
     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
     ... 7 more

I edited SPARK_DIST_CLASSPATH according to the post "Where are hadoop jar files in hadoop 2?":

export SPARK_DIST_CLASSPATH=~/hadoop/share/hadoop/common/lib:~/hadoop/share/hadoop/common:~/hadoop/share/hadoop/mapreduce:~/hadoop/share/hadoop/mapreduce/lib:~/hadoop/share/hadoop/yarn:~/hadoop/share/hadoop/yarn/lib

But I'm still getting the same error. I can see the slf4j jar file is under ~/hadoop/share/hadoop/common/lib.
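
A quick check, sketched with the paths from the question: a bare directory entry on a Java classpath only exposes .class files, not the jars inside it, so an entry like ~/hadoop/share/hadoop/common/lib never puts the slf4j jar on the classpath. Each directory would need a /* wildcard, or Hadoop can be asked to print a fully wildcarded classpath:

# Confirm the slf4j jar really is in Hadoop's lib directory (path taken from the question)
ls /home/server/hadoop/share/hadoop/common/lib/ | grep slf4j

# Ask Hadoop for a classpath that already contains the per-directory /* wildcards
export SPARK_DIST_CLASSPATH=$(/home/server/hadoop/bin/hadoop classpath)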

How can I fix this error?

Thank you!

  • Try not putting a `~` in your variable paths – OneCricketeer Feb 17 '17 at 21:20
  • @cricket_007 I replaced '~' with '/home/server' but the error still appears. – Top.Deck Feb 17 '17 at 21:26
  • I'm not really sure why you have that really long value for the variable. https://spark.apache.org/docs/latest/hadoop-provided.html – OneCricketeer Feb 17 '17 at 21:38
  • I'm new to big data, so please correct me if I'm wrong. I believe [Where are hadoop jar files in hadoop 2?](http://stackoverflow.com/questions/15188042/where-are-hadoop-jar-files-in-hadoop-2) mentioned that in Hadoop 2 all the jar files are in different classpath locations. – Top.Deck Feb 17 '17 at 21:41
  • I would trust the latest Spark documentation more than a four-year-old StackOverflow post. Hadoop 2 has been around for a bit longer than Spark. – OneCricketeer Feb 17 '17 at 21:46
  • I tried that before, and I just realised that I should run `conf/spark-env.sh` and `./sbin/start-master.sh` in the same terminal instead of in two terminal tabs. The problem is solved now. Thanks a lot! – Top.Deck Feb 17 '17 at 21:56
  • I can't remember what `start-master` does, but `spark-env` is auto-executed by other Spark commands. – OneCricketeer Feb 17 '17 at 22:06

1 Answer


“Hadoop free” builds need to modify SPARK_DIST_CLASSPATH to include Hadoop’s package jars.

The most convenient place to do this is to add an entry in conf/spark-env.sh:

export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)  

See https://spark.apache.org/docs/latest/hadoop-provided.html for details.
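
Putting it together, a minimal conf/spark-env.sh for a "Hadoop free" build might look like the sketch below; the /home/server/hadoop path is taken from the question and should be adjusted to the actual installation. Because Spark's launch scripts source conf/spark-env.sh themselves, the variable is picked up no matter which terminal starts the master:

# conf/spark-env.sh -- sketch for a "hadoop provided" Spark build
# (Hadoop location assumed from the question; change it to match your install)
export SPARK_DIST_CLASSPATH=$(/home/server/hadoop/bin/hadoop classpath)

Then restart the master so the new classpath takes effect:

./sbin/stop-master.sh
./sbin/start-master.sh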
