
I have installed Spark 2.0 on CDH 5.10 by following https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html

After completing the configuration, running spark2-submit --version reports the correct version, 2.0.

However, when I submit a Spark job, it first fails with:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream

This clearly indicates that the Hadoop libraries are not on the classpath. My question is: is something wrong with my Spark 2 installation? Also, once I add the jars with sparkExtralibCLasspath for the driver and core, it then says SPARK_HADOOP_CONF Is not set.

How can I verify that my installation is correct? I am also trying to understand where my spark2 conf dirs are.

I have looked at a few earlier questions, such as https://community.cloudera.com/t5/Cloudera-Manager-Installation/CHD-5-7-spark-shell-java-lang-ClassNotFoundException-org-apache/td-p/42209 and "NoClassDefFoundError com.apache.hadoop.fs.FSDataInputStream when execute spark-shell" on Stack Overflow, but they did not help.
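To make the "not on the classpath" claim checkable, here is a minimal sketch of the kind of test I have in mind. It assumes the missing class ships in a jar whose name starts with hadoop-common, and that the classpath Spark sees is exposed through the SPARK_DIST_CLASSPATH environment variable; whether the CDH spark2 wrapper scripts actually set that variable in my shell is an assumption. The helper names expand_classpath and find_jars are made up for illustration.

```python
import glob
import os


def expand_classpath(classpath, sep=os.pathsep):
    """Split a Java-style classpath string into individual paths.

    Entries ending in '*' are expanded to the .jar files in that
    directory, similar to how the JVM treats wildcard entries.
    Empty entries are skipped.
    """
    paths = []
    for entry in classpath.split(sep):
        entry = entry.strip()
        if not entry:
            continue
        if entry.endswith("*"):
            directory = os.path.dirname(entry)
            paths.extend(sorted(glob.glob(os.path.join(directory, "*.jar"))))
        else:
            paths.append(entry)
    return paths


def find_jars(classpath, prefix="hadoop-common", sep=os.pathsep):
    """Return classpath entries that are .jar files whose name starts with prefix."""
    return [
        path
        for path in expand_classpath(classpath, sep)
        if os.path.basename(path).startswith(prefix) and path.endswith(".jar")
    ]


if __name__ == "__main__":
    # Assumption: SPARK_DIST_CLASSPATH holds the Hadoop classpath for Spark.
    classpath = os.environ.get("SPARK_DIST_CLASSPATH", "")
    hits = find_jars(classpath)
    if hits:
        print("hadoop-common jars found:")
        for hit in hits:
            print("  " + hit)
    else:
        print("no hadoop-common jar found on this classpath")
```

If this reports no hadoop-common jar, that would be consistent with the NoClassDefFoundError above, although it would not by itself say which configuration file is responsible.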

I am using the spark2-shell and spark2-submit commands.

Further investigation with https://community.cloudera.com/t5/Cloudera-Manager-Installation/CDH-5-5-pyspark-java-lang-NoClassDefFoundError-org-apache-hadoop/td-p/34424 suggests that setting SPARK_EXTRA_LIB_PATH correctly for spark2 might fix this issue. Can somebody guide me, please? Thanks.
