
Can anyone please tell me how to connect to HBase from Spark using Kerberos? I am using the code below to create a connection to HBase, but it still has issues.

    val genericMessage = messages.mapPartitions(iter => {
      val context = TaskContext.get
      val partitionId: Int = context.partitionId
      logger.info(s"log - Process for partition: $partitionId")

      val conf: Configuration = HBaseConfiguration.create()
      conf.set("hbase.zookeeper.quorum", ZOOKEEPER_QOURUM)
      conf.set("hbase.rpc.timeout", "1800000")
      conf.set("hbase.client.scanner.timeout.period", "1800000")
      conf.set("hadoop.security.authentication", "kerberos")
      import org.apache.hadoop.security.UserGroupInformation
      UserGroupInformation.loginUserFromKeytab("customer@mail.com", keyTab)
      val connection: Connection = ConnectionFactory.createConnection(conf)
      val custTable: Table = connection.getTable(TableName.valueOf("prod:cusotmer"))

      // iter.map is lazy, so force evaluation before the table and connection are closed
      val avroMessages = iter.map(msg => (msg._1, enrichment(msg._1, decodeBinaryToAvro(msg._2)))).toList
      // Close the table first, then the connection that owns it
      custTable.close()
      connection.close()
      avroMessages.iterator
    })

Thanks

Kiran
  • Spark executors have to connect individually to HBase, but they don't have a Kerberos ticket _(except in `local` mode)_. Is that the kind of "issue" you are talking about?? – Samson Scharfrichter Nov 15 '18 at 18:37
  • Also, RTFM > https://hbase.apache.org/book.html#spark > that HBase module (contributed by Cloudera) handles Kerberos out of the box. – Samson Scharfrichter Nov 15 '18 at 18:42
  • @SamsonScharfrichter, thanks for the reply. I tried to use the `--principal` and `--keytab` options, but it still didn't work. It throws a NullPointerException when I use these options. And if I use the following in code, it throws "Unable to obtain password from user": `System.setProperty("java.security.krb5.conf", "/etc/krb5.conf")` `UserGroupInformation.loginUserFromKeytab("customer@CREALM.NET", keyTab)` – Kiran Nov 16 '18 at 06:59
  • Don't try to manage Kerberos authentication inside _executor_ code. The proper way is to let the Spark init obtain "tokens" for HDFS, Hive, HBase etc then broadcast the "tokens" to each executor; the trick is token-based authentication in HBase is not documented... so use the Spark-HBase connector instead, with `HBaseContext`, unless you are ready to read the Spark and HBase code on GitHub and also have several years of experience with Kerberos and the scars to prove it. – Samson Scharfrichter Nov 16 '18 at 17:08
  • Recommended reading: https://stackoverflow.com/questions/44265562/spark-on-yarn-secured-hbase – Samson Scharfrichter Nov 16 '18 at 17:18
  • Thanks @SamsonScharfrichter for the reply. I appreciate your time. I have already tried RDD, but I am getting the issue below: `at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)` `at org.apache.hadoop.hbase.client.HTable.get(HTable.java:935)` `at org.apache.hadoop.hbase.client.HTable.get(HTable.java:901)` `Caused by: java.io.IOException: hconnection-0x614c1a0d closed` `at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveZooKeeperWatcher(ConnectionManager.java:1806)` – Kiran Nov 16 '18 at 20:27
  • Below is the logic I am using: `genericMessage.foreachRDD(rdd => { val getRDD = rdd.hbaseMapPartitions[(String, GenericRecord)](hBaseContext, (it, connection) => { val table = connection.getTable(TableName.valueOf("test:custTable")) it.map{ row => val key = row._1 val genericRecord = row._2 val result = table.get(new Get(Bytes.toBytes(imsi))) ..... } (key, genericRecord) } }) })` – Kiran Nov 16 '18 at 20:28
  • Hi @SamsonScharfrichter, can you please help me with this? It is very high priority for me. I'm using HBaseContext. It works fine when I use BulkGet, but it gives me the above issue when I use hbaseMapPartitions. I think there is some issue when it tries to establish a connection in each mapPartition. I cannot use BulkGet for my requirement, as the output of BulkGet is an RDD. Thanks – Kiran Nov 17 '18 at 21:33
  • Sorry, I can't. I have not touched HBase in the last two years... – Samson Scharfrichter Nov 18 '18 at 16:41
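
One possible explanation for the `hconnection-... closed` error in the comments above, offered as a guess rather than a confirmed diagnosis: `Iterator.map` in Scala is lazy, so a lookup placed inside `it.map { ... }` only runs when the returned iterator is consumed, which may be after the connection has already been closed. The sketch below reproduces that pattern without HBase. `FakeConnection`, `lazyLookup` and `eagerLookup` are hypothetical names invented for illustration and are not part of any HBase or Spark API.

```scala
// Hypothetical stand-in for an HBase Connection; NOT a real HBase class.
final class FakeConnection {
  private var open = true

  // Simulates a Get: fails once the connection has been closed.
  def get(key: String): String = {
    if (!open) throw new java.io.IOException("connection closed")
    s"row-for-$key"
  }

  def close(): Unit = { open = false }
}

object LazyIteratorDemo {
  // Same shape as a per-partition function: open, map, close, return.
  // The map is lazy, so no lookup has run when close() is called.
  def lazyLookup(keys: Iterator[String]): Iterator[String] = {
    val conn = new FakeConnection
    val out = keys.map(k => conn.get(k))
    conn.close()
    out
  }

  // Forces every lookup while the connection is still open,
  // then hands back an iterator over the materialized results.
  def eagerLookup(keys: Iterator[String]): Iterator[String] = {
    val conn = new FakeConnection
    try keys.map(k => conn.get(k)).toList.iterator
    finally conn.close()
  }

  def main(args: Array[String]): Unit = {
    println(eagerLookup(Iterator("a", "b")).toList)
    // Consuming the lazy iterator triggers the lookups after close().
    println(scala.util.Try(lazyLookup(Iterator("a", "b")).toList).isFailure)
  }
}
```

If this is what happens in the real job, materializing the partition (for example with `.toList`) before the function returns would avoid the error, at the cost of holding the whole partition in memory.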

0 Answers