
I am getting an error when I try to query a Hive table (created through the Hive–HBase integration) from Spark.

Steps I followed. Hive table creation code:

CREATE TABLE test.sample(id string, name string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,details:name")
TBLPROPERTIES ("hbase.table.name" = "sample");

DESCRIBE test.sample;

 col_name data_type comment
 id string from deserializer
 name string from deserializer

I start the Spark shell with this command:

spark-shell --master local[2] --driver-class-path \
/usr/local/hive/lib/hive-hbase-handler-1.2.1.jar:\
/usr/local/hbase/lib/hbase-server-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-protocol-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-hadoop2-compat-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-hadoop-compat-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-client-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/hbase-common-0.98.9-hadoop2.jar:\
/usr/local/hbase/lib/htrace-core-2.04.jar:\
/usr/local/hbase/lib/hbase-common-0.98.9-hadoop2-tests.jar:\
/usr/local/hbase/lib/hbase-server-0.98.9-hadoop2-tests.jar:\
/usr/local/hive/lib/zookeeper-3.4.6.jar:\
/usr/local/hive/lib/guava-14.0.1.jar

In spark-shell:

val sqlContext=new org.apache.spark.sql.hive.HiveContext(sc)

sqlContext.sql("select count(*) from test.sample").collect()

Stack trace:

SQL context available as sqlContext.

scala> sqlContext.sql("select count(*) from test.sample").collect()

16/09/02 04:49:28 INFO parse.ParseDriver: Parsing command: select count(*) from test.sample
16/09/02 04:49:35 INFO parse.ParseDriver: Parse Completed
16/09/02 04:49:40 INFO metastore.HiveMetaStore: 0: get_table : db=test tbl=sample
16/09/02 04:49:40 INFO HiveMetaStore.audit: ugi=hdfs    ip=unknown-ip-addr  cmd=get_table : db=test tbl=sample  
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes
    at org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
    at org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
    at org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
    at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
    at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:391)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:258)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:605)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1$$anonfun$3.apply(ClientWrapper.scala:331)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1$$anonfun$3.apply(ClientWrapper.scala:326)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1.apply(ClientWrapper.scala:326)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1.apply(ClientWrapper.scala:321)
    at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:279)
    at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:226)
    at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:225)
    at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:268)
    at org.apache.spark.sql.hive.client.ClientWrapper.getTableOption(ClientWrapper.scala:321)
    at org.apache.spark.sql.hive.client.ClientInterface$class.getTable(ClientInterface.scala:122)
    at org.apache.spark.sql.hive.client.ClientWrapper.getTable(ClientWrapper.scala:60)
    at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:384)
    at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:457)
    at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:161)
    at org.apache.spark.sql.hive.HiveContext$$anon$2.lookupRelation(HiveContext.scala:457)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:303)

I am using Hadoop 2.6.0, Spark 1.6.0, Hive 1.2.1, and HBase 0.98.9.

I added this setting in hadoop-env.sh:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/lib/*

Can somebody please suggest a solution?

user6608138
  • `java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes`, check your classpath. – GoingMyWay Sep 03 '16 at 07:12
  • Thank you Alexander for your reply. I added the classpath as `export SPARK_HOME=/usr/local/spark`, `export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin`, `export SPARK_CLASSPATH=$SPARK_HOME/lib:$HBASE_HOME/lib:$HIVE_HOME/lib`. Can you please tell me if there is any mistake in what I did? – user6608138 Sep 07 '16 at 07:39
  • I am new to Spark. Now I am able to query Hive managed tables through Spark SQL, but I don't know how to query Hive tables backed by the HBaseStorageHandler through Spark SQL. Can you please guide me? Thank you Alexander. – user6608138 Sep 07 '16 at 08:16
  • Sorry, I don't know HBase. If you have questions on HBase, try to search on Google or ask a new question to get help! – GoingMyWay Sep 08 '16 at 01:25
  • Thank you Alexander, for your reply. – user6608138 Sep 08 '16 at 04:23

2 Answers

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes

This happens because the HBase-related jars are not on the classpath.

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`hbase classpath`

This should include all the HBase-related jar files; alternatively, see my answer here on using --jars.
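
As a quick sanity check (a sketch added here, not part of the original answer), you can try loading the missing class directly from the spark-shell; if it throws, the HBase jars never made it onto the driver classpath:

// Sketch: check whether the class named in the error is visible to the driver.
try {
  Class.forName("org.apache.hadoop.hbase.util.Bytes")
  println("org.apache.hadoop.hbase.util.Bytes is visible to the driver")
} catch {
  case _: ClassNotFoundException =>
    println("HBase jars are missing from the driver classpath")
}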

Note: To verify the classpath, you can add the code below in the driver to print all the classpath resources.

Scala version:

// Print every URL on the system class loader's classpath.
val cl = ClassLoader.getSystemClassLoader
cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)
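
If the list is long, a one-line filter (a sketch added here, assuming the system class loader is a URLClassLoader as above) shows whether any HBase jars are present at all:

// Sketch: keep only the classpath entries whose path mentions "hbase".
cl.asInstanceOf[java.net.URLClassLoader].getURLs.map(_.getFile).filter(_.toLowerCase.contains("hbase")).foreach(println)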

Java version:

import java.net.URL;
import java.net.URLClassLoader;
...

ClassLoader cl = ClassLoader.getSystemClassLoader();
URL[] urls = ((URLClassLoader) cl).getURLs();
for (URL url : urls) {
    System.out.println(url.getFile());
}
Ram Ghadiyaram
  • Hello, I am facing the same problem. The above solution is not working. – Rohan Nayak Oct 18 '17 at 06:59
  • @RohanNayak: Raise a new question describing your environment and your problem; this is already a 1yr+ old question. – Ram Ghadiyaram Oct 18 '17 at 07:02
  • @RohanNayak: What is the output of this command: `hbase classpath` (wrapped in backticks)? – Ram Ghadiyaram Oct 18 '17 at 07:14
  • @RohanNayak: Classpath errors are tricky and might be environment specific... I updated the answer with how to verify the classpath. – Ram Ghadiyaram Oct 18 '17 at 07:26
  • Hi @Ram. I have created a new thread, but no answer yet: https://stackoverflow.com/questions/46793327/spark-sqlcontext-and-hbase-java-lang-noclassdeffounderror-org-apache-hadoop-hb – Rohan Nayak Oct 23 '17 at 09:40
  • @RohanNayak: Sure, try to use the --jars option. – Ram Ghadiyaram Oct 23 '17 at 10:17
  • I did pass all the jars using --jars. The earlier "org/apache/hadoop/hbase/util/Bytes" error is solved, but now I get a strange error: Caused by: java.lang.ClassNotFoundException: com.google.common.primitives.Bytes – Rohan Nayak Oct 23 '17 at 12:06
  • @RohanNayak: Simple, the Google Guava jar is not in the classpath... check [Bytes](https://google.github.io/guava/releases/19.0/api/docs/com/google/common/primitives/Bytes.html) – Ram Ghadiyaram Oct 23 '17 at 12:12
  • Can you check this way whether Guava is in the classpath or not: `val cl = ClassLoader.getSystemClassLoader; cl.asInstanceOf[java.net.URLClassLoader].getURLs.foreach(println)` – Ram Ghadiyaram Oct 23 '17 at 12:14

I got it working. You have to use the jars below.

spark-shell --master yarn-client --executor-cores 10 --executor-memory 20G --num-executors 15 --driver-memory 2G \
  --driver-class-path /usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar \
  --jars /usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar \
  --files /etc/spark/conf/hbase-site.xml
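
Once the shell comes up with those jars, the query from the question should go through; a minimal check (a sketch, assuming the test.sample table from the question exists) looks like this:

// Sketch: re-run the original query against the HBase-backed Hive table.
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("select count(*) from test.sample").collect().foreach(println)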

Rohan Nayak