My question is related to the existing thread

However, we are on HDP 2.6.3 and Ambari 2.6.1.5.

Question: We are trying to access Hive table data from Spark 2.2.

The command:

spark-submit --class com.virtuslab.sparksql.MainClass  --master yarn --deploy-mode client /tmp/spark-hive-test/spark_sql_under_the_hood-spark2.2.0.jar

In client mode it works. Please note that we have not passed --files or --conf spark.yarn.dist.files.

spark-submit --class com.virtuslab.sparksql.MainClass  --master yarn --deploy-mode cluster /tmp/spark-hive-test/spark_sql_under_the_hood-spark2.2.0.jar

In cluster mode it fails with:

diagnostics: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'xyz' not found in database 'qwerty';
     ApplicationMaster host: 121.121.121.121
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1523616607943
     final status: FAILED
     tracking URL: https://managenode002xxserver:8090/proxy/application_1523374609937_10224/
     user: abc123
Exception in thread "main" org.apache.spark.SparkException: Application application_1523374609937_10224 finished with failed status
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1187)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1233)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:782)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Please note we have not used --files or --conf spark.yarn.dist.files

However, the same job works in cluster mode when hive-site.xml is passed with --files:

spark-submit --class com.virtuslab.sparksql.MainClass  --master yarn --deploy-mode cluster --files /etc/spark2/conf/hive-site.xml /tmp/spark-hive-test/spark_sql_under_the_hood-spark2.2.0.jar

With this command, the job completes and the result is displayed.
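
For comparison, here is a sketch of the same cluster-mode submission using the spark.yarn.dist.files property instead of --files. We have not run this variant ourselves; it assumes the property distributes hive-site.xml to the YARN containers in the same way --files does. The variable names HIVE_SITE, APP_JAR and DIST_FILES_CONF are only illustrative, and the leading echo just prints the command for review.

```shell
# Sketch only: not run as part of the tests described in this question.
# Paths and class name come from the commands above; variable names are illustrative.
HIVE_SITE="/etc/spark2/conf/hive-site.xml"
APP_JAR="/tmp/spark-hive-test/spark_sql_under_the_hood-spark2.2.0.jar"

# Assumption: spark.yarn.dist.files ships the listed files to the YARN
# containers, which is what --files did in the working command.
DIST_FILES_CONF="spark.yarn.dist.files=${HIVE_SITE}"

# Print the command for review; remove the leading `echo` to actually submit.
echo spark-submit \
  --class com.virtuslab.sparksql.MainClass \
  --master yarn \
  --deploy-mode cluster \
  --conf "${DIST_FILES_CONF}" \
  "${APP_JAR}"
```

If this behaves like --files, the same key could presumably be set once in spark-defaults.conf so that every submission picks it up.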

Is there a bug that prevents Spark from picking up /etc/spark2/conf when it runs in YARN cluster mode?

Note: /etc/spark2/conf contains hive-site.xml on all nodes of the cluster.

  • The bug is in the HDP documentation. You found my post, so how is this different? Did you try the answer there? – OneCricketeer Apr 18 '18 at 12:15
  • @cricket_007 as mentioned, the conf folder /etc/spark2/conf is present on all the nodes (edge/master/nodemanager nodes). Is there any specific reason it is not being picked up? We do have the Spark2 clients on the resourcemanager nodes, and we use Ambari to distribute the config files – Venkata Sudheer Kumar M Apr 18 '18 at 13:11
  • Your question is exactly the same as mine before... I understand what you're saying, and the Hortonworks documentation says it should work that way as well, but it clearly does not. – OneCricketeer Apr 18 '18 at 13:20
  • Also, see the MapR document specifically saying to edit the dist files property. https://maprdocs.mapr.com/home/Spark/IntegrateSparkSQL_Hive.html – OneCricketeer Apr 18 '18 at 13:23
  • Also, I did find a JIRA issue in Spark previously where they mentioned they removed the `conf/hive-site` from being read. I don't know the ticket number now or why it was done. – OneCricketeer Apr 18 '18 at 13:27
  • @cricket_007 thanks a lot for the details, I will check with the HDP team. – Venkata Sudheer Kumar M Apr 18 '18 at 13:42
  • @cricket_007 is this the Spark bug you are referring to? https://issues.apache.org/jira/browse/SPARK-22463 – Venkata Sudheer Kumar M Apr 18 '18 at 16:22
  • Sure, they labelled that as a Bug, but no. There was a separate ticket around the 2.0 releases that they commented on actually removing that file from being read – OneCricketeer Apr 18 '18 at 19:53
