
I am loading data from one Hive table to another using Spark SQL. I've created a SparkSession with enableHiveSupport and I'm able to create a table in Hive using Spark SQL, but when I load data from one Hive table to another Hive table using Spark SQL I get a permission issue:

    Permission denied: user=anonymous, access=WRITE, path="hivepath"

I am running this as the spark user, but I don't understand why it's using anonymous as the user instead of spark. Can anyone suggest how I should resolve this issue?

I'm using the below code:

    sparksession.sql("insert overwrite table dbname.tablename select * from dbname.tablename")
RamblinRose
Up Ap

3 Answers


If you're using Spark, you need to set the Hadoop username in your Spark context before the SparkSession is created.

    // Must be set before the SparkSession (and its SparkContext) is created;
    // setting it afterwards has no effect on the user already resolved.
    System.setProperty("HADOOP_USER_NAME","newUserName")
    val spark = SparkSession
      .builder()
      .appName("SparkSessionApp")
      .master("local[*]")
      .getOrCreate()

    println(spark.sparkContext.sparkUser)
Eric Yang
  • When I execute the above command I get the username as spark, but while executing sparksession.sql("insert overwrite table dbname.tablename select * from dbname.tablename") I still get the anonymous user. – Up Ap Mar 17 '20 at 12:31
  • You're going to need to paste all of your code and not just that one command – Eric Yang Mar 17 '20 at 12:55
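Putting the pieces of this answer together, a minimal sketch of a full job might look like the following. This is an assumption about how the asker's code could be arranged, not their actual code; the user name `spark` and the table names are taken from the question.

```scala
import org.apache.spark.sql.SparkSession

object HiveInsertApp {
  def main(args: Array[String]): Unit = {
    // Must run before the SparkSession (and its SparkContext) is created,
    // otherwise the Hadoop client may already have resolved "anonymous".
    System.setProperty("HADOOP_USER_NAME", "spark")

    val spark = SparkSession
      .builder()
      .appName("HiveInsertApp")
      .enableHiveSupport() // required to read/write Hive tables
      .getOrCreate()

    // Sanity check: which user Spark thinks it is running as
    println(spark.sparkContext.sparkUser)

    // Corrected INSERT OVERWRITE syntax (no stray quote, no "into")
    spark.sql("insert overwrite table dbname.tablename select * from dbname.tablename")

    spark.stop()
  }
}
```

The key point is the ordering: the property is set before `getOrCreate()`, and `enableHiveSupport()` is kept from the question's setup.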

To validate which user you are running as, run the command below:

    sc.sparkUser

It will show you the current user. Then, in Scala, you can set a new username with:

    System.setProperty("HADOOP_USER_NAME","newUserName")
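One caveat worth noting (this is my addition based on how JVM system properties work, not part of the answer above): the property is only consulted when the Hadoop client first resolves the current user, so it must be set before any Spark or Hadoop code runs. The property itself behaves like any other JVM system property:

```scala
// HADOOP_USER_NAME can come from the shell environment or from a JVM system
// property; the system property only takes effect if it is set before
// Hadoop first resolves the current user.
System.setProperty("HADOOP_USER_NAME", "newUserName")
println(System.getProperty("HADOOP_USER_NAME"))
```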
user3190018
shashank

The first thing you may try is creating an HDFS home directory for the anonymous user:

    root@host:~# su - hdfs
    hdfs@host:~$ hadoop fs -mkdir /user/anonymous
    hdfs@host:~$ hadoop fs -chown anonymous /user/anonymous

In general, exporting HADOOP_USER_NAME=youruser before spark-submit will work, along with a spark-submit configuration like the one below (this passes the variable through to the YARN application master):

    --conf "spark.yarn.appMasterEnv.HADOOP_USER_NAME=${HADOOP_USER_NAME}" \

Alternatively, you can try running the job as that user, e.g. sudo -u username spark-submit --class yourclass ...


Note: ideally this username setup should be part of your initial cluster setup; if that's done, none of the above is needed and everything is seamless.

Personally, I don't like hard-coding the username in the code; it should come from outside the Spark job.

Ram Ghadiyaram