
I'm trying to write simple Scala code that queries Hive data located on a remote cluster. My code will be deployed to cluster A but has to query a Hive table located on cluster B. I'm developing this in my local Eclipse and getting the following error:

org.apache.spark.sql.AnalysisException: Table not found: `<mydatabase>`.`<mytable>`;

The relevant part of my code is below

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val conf = new SparkConf().setAppName("Xing")
      .setMaster("local[*]")
    conf.set("hive.metastore.uris", "thrift://<clusterB url>:10000")
    val sc = SparkContext.getOrCreate(conf)
    val hc = new HiveContext(sc)
    val df = hc.sql("select * from <mydatabase>.<mytable>")

I suspect it is a configuration issue, but I may be wrong. Any advice would be greatly appreciated.

Michael D
  • Can you run beeline and access the same HiveServer/database/table? –  Nov 23 '16 at 15:38
  • I can query this table using Hive JDBC with no problems. This cluster has Kerberos security setup. I was trying to set the same properties in SparkConf but had the same error. These are the properties I'm setting: conf.set("login.user","") conf.set("keytab.file", "") conf.set("sun.security.krb5.debug","false") conf.set("java.security.krb5.conf","") conf.set("java.library.path","") conf.set("hadoop.home.dir","") conf.set("hadoop.security.authentication","kerberos") – Michael D Nov 23 '16 at 16:44
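(For reference: setting Kerberos-related properties on `SparkConf`, as in the comment above, generally does not perform the login itself. In Spark 1.x the usual approach is to authenticate through Hadoop's `UserGroupInformation` API before creating the `SparkContext`. A minimal sketch, where the principal and keytab path are placeholder assumptions:)

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Tell the Hadoop client libraries that the cluster uses Kerberos.
val hadoopConf = new Configuration()
hadoopConf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(hadoopConf)

// Log in from a keytab before any SparkContext/HiveContext is created.
// Both arguments below are hypothetical placeholders.
UserGroupInformation.loginUserFromKeytab(
  "user@EXAMPLE.COM",       // Kerberos principal (assumption)
  "/path/to/user.keytab")   // keytab file path (assumption)
```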

1 Answer


The port in the metastore URI should be 9083 (the Hive metastore's default thrift port), unless you purposely changed it. Port 10000 is the default for HiveServer2, which is the JDBC endpoint, not the metastore.
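A minimal sketch of the corrected setting (the hostname is the same placeholder used in the question; 9083 is assumed to be the metastore's default, unchanged port):

```scala
// Point Spark at the remote Hive *metastore* service, not HiveServer2.
conf.set("hive.metastore.uris", "thrift://<clusterB url>:9083")
```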

Lokesh Yadav