
I have an issue with Kerberos credentials. This work is based on a cluster, and the keytabs are provided on each datanode. Basically it is an Oozie workflow shell action whose purpose is to write to HBase from a Spark job. If the job is run in cluster mode without Oozie, it works as expected. But with Oozie it throws an exception as follows:

WARN AbstractRpcClient: Exception encountered while connecting to the server: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
18/11/26 15:30:24 ERROR AbstractRpcClient: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:611)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:156)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:737)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:734)
    at java.security.AccessController.doPrivileged(Native Method)

The Oozie shell action looks like this:

<action name="spark-hbase" retry-max="${retryMax}" retry-interval="${retryInterval}">
  <shell xmlns="uri:oozie:shell-action:0.3">
    <exec>submit.sh</exec>
    <env-var>QUEUE_NAME=${queueName}</env-var>
    <env-var>PRINCIPAL=${principal}</env-var>
    <env-var>KEYTAB=${keytab}</env-var>
    <env-var>VERBOSE=${verbose}</env-var>
    <env-var>CURR_DATE=${firstNotNull(currentDate, "")}</env-var>
    <env-var>DATA_TABLE=${dataTable}</env-var>
    <file>bin/submit.sh</file>
  </shell>
  <ok to="end"/>
  <error to="kill"/>
</action>

The spark-submit command in submit.sh looks like this:

CLASS="App class location"
JAR="compiled jar file"

HBASE_JARS="HBase jars"
HBASE_CONF='hbase-site.xml location'

HIVE_JARS="Hive jars"
HIVE_CONF='tez-site.xml location'

HADOOP_CONF='hdfs-site.xml location'

SPARK_BIN_DIR="spark2-client bin directory location"

${SPARK_BIN_DIR}/spark-submit \
  --class ${CLASS} \
  --principal "${PRINCIPAL}" \
  --keytab "${KEYTAB}" \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 10G \
  --executor-memory 4G \
  --num-executors 10 \
  --conf spark.default.parallelism=24 \
  --jars ${HBASE_JARS},${HIVE_JARS} \
  --files ${HBASE_CONF},${HIVE_CONF},${HADOOP_CONF} \
  --conf spark.ui.port=4042 \
  --conf "spark.executor.extraJavaOptions=-verbose:class - 
  Dsun.security.krb5.debug=true" \
  --conf "spark.driver.extraJavaOptions=-verbose:class - 
  Dsun.security.krb5.debug=true" \
  --queue "${QUEUE_NAME}" \
  ${JAR} \
    --app.name "spark-hbase" \
    --data.table "${DATA_TABLE}" \
    --verbose
  • All I can tell you about Spark+Kerberos+HBase is in https://stackoverflow.com/questions/44265562/spark-on-yarn-secured-hbase - enjoy... – Samson Scharfrichter Nov 26 '18 at 17:28
  • Thanks for the reply. I took a look at this link, but my problem is that `obtainDelegationTokens` for HBase via Oozie is not working. One more thing: I try to write to HBase via a Hive table backed by HBase. So there is a Spark job which writes to a Hive table backed by HBase. – Ardian Koltraka Nov 27 '18 at 16:36
  • Oozie cannot create Kerberos tickets for your job, simply because it does not have your password... All it can do is request HDFS/YARN/Hive/HBase to create **auth tokens** in your name (because `oozie` is a trusted, privileged "proxy" account). Except that Hive & HBase tokens are created **only** if you specify the appropriate `credentials` in your action; see the sketch after these comments. Cf. https://stackoverflow.com/questions/33212535/passing-hbase-credentials-in-oozie-java-action – Samson Scharfrichter Nov 27 '18 at 23:08
  • Thank you for the suggestions. I got it working by doing basically two steps (shown below): 1) creating a soft link of hbase-site.xml in /etc/spark2/conf on the host where the Spark job is submitted from: ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml 2) adding a kinit command in the shell script before the spark-submit command: kinit -kt "${KEYTAB}" "${PRINCIPAL}" – Ardian Koltraka Nov 28 '18 at 16:04
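
A minimal sketch of the `credentials` mechanism mentioned above, assuming an Oozie installation where the HBase credential type (org.apache.oozie.action.hadoop.HbaseCredentials) is registered in oozie-site.xml; the credential name `hbase_creds` and the property shown are illustrative, not taken from the original workflow:

<workflow-app name="spark-hbase-wf" xmlns="uri:oozie:workflow:0.5">
  <credentials>
    <!-- Oozie obtains an HBase auth token on the user's behalf before it
         launches the action; this needs the "hbase" credential type
         registered in oozie-site.xml -->
    <credential name="hbase_creds" type="hbase">
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>${zkQuorum}</value>
      </property>
    </credential>
  </credentials>

  <!-- The action opts in to the token via its cred attribute -->
  <action name="spark-hbase" cred="hbase_creds" retry-max="${retryMax}" retry-interval="${retryInterval}">
    ...
  </action>
</workflow-app>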
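
And a sketch of the two-step fix from the last comment, applied to the question's submit.sh (the soft link is created once on the submitting host, outside the script):

# 1) One-time, on the host the job is submitted from:
#    make hbase-site.xml visible to Spark
ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml

# 2) In submit.sh, obtain a Kerberos TGT from the keytab
#    before spark-submit runs
kinit -kt "${KEYTAB}" "${PRINCIPAL}"

${SPARK_BIN_DIR}/spark-submit \
  --class ${CLASS} \
  ...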

1 Answer

Creating a soft link on all the nodes in the cluster may not always be feasible. We resolved it by adding the HBase configuration directory to the Spark configuration, overriding the SPARK_CONF_DIR environment variable in the shell before the spark-submit command:

export SPARK_CONF_DIR=/etc/spark2/conf:/etc/hbase/conf
– Ami Ranjan
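
In the context of the question's submit.sh, the override goes just before the spark-submit call (a sketch reusing the variables defined in the question's script):

# Let spark-submit see both the Spark and the HBase configuration directories
export SPARK_CONF_DIR=/etc/spark2/conf:/etc/hbase/conf

${SPARK_BIN_DIR}/spark-submit \
  --class ${CLASS} \
  ...

Unlike the soft-link approach, this only changes the environment of the submitting shell, so nothing has to be created on every node.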