
I would like to list HBase tables using Spark SQL.

I tried the code below, but it's not working. Do we need to set the HBase host, ZooKeeper quorum, and other details in the Spark SQL context options?

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sparkConf = new SparkConf().setAppName("test")
    val sc = new SparkContext(sparkConf)
    // HiveContext is constructed from the SparkContext (it extends SQLContext)
    val hiveContext = new HiveContext(sc)
    val listOfTables = hiveContext.sql("list")
    listOfTables.show
Shankar

1 Answer


AFAIK, there is no way for Spark SQL to access HBase tables directly.

A HiveContext only knows about the tables registered in the Hive metastore.

  • So I would suggest creating an external table from Hive, like the example below:

    CREATE EXTERNAL TABLE users(userid int, name string, email string, notes string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,small:name,small:email,large:notes")
    TBLPROPERTIES ("hbase.table.name" = "users");

and then you can use:

    val tbls = hiveContext.sql("show tables")
    tbls.show()
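
Once the external table is registered in the metastore, the same HiveContext can also query the HBase-backed data through Spark SQL. A minimal sketch, assuming the users table above exists and the hive-hbase-handler jar is on the classpath:

    // Spark SQL now sees the HBase data through the Hive mapping
    val users = hiveContext.sql("SELECT userid, name FROM users")
    users.show()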
  • Alternatively, you can use this approach without Spark SQL.

There, HBaseAdmin is used to take a row count of the table; in your case, you can use HBaseAdmin's getTableNames() instead, as sketched below.

See HBaseAdmin.
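
A minimal sketch of that HBaseAdmin approach, assuming an HBase 1.x client on the classpath; zk-host is a placeholder for your ZooKeeper quorum:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.HBaseAdmin

    // Point the client at your cluster; zk-host is an assumed placeholder
    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set("hbase.zookeeper.quorum", "zk-host")

    val admin = new HBaseAdmin(hbaseConf)
    // listTables() returns an HTableDescriptor per user table
    admin.listTables().foreach(t => println(t.getNameAsString))
    admin.close()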

Ram Ghadiyaram
  • Thanks for the answer; we are currently using the Spark HBase connector from Hortonworks to read and write tables and it's working fine. I just wanted this for a POC, which is why I posted. – Shankar Jan 22 '17 at 08:51