
I am working on a Spark program that will load data into Hive tables, and I am doing that on Spark version 2.0.2. Initially I executed these two steps in spark-shell:

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.master("local").appName("SparkHive").enableHiveSupport().config("hive.exec.dynamic.partition","true").config("hive.exec.dynamic.partition.mode","nonstrict").config("hive.metastore.warehouse.dir","/user/hive/warehouse").getOrCreate()
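
As a quick sanity check (assuming the session builds without errors), the settings above should be visible on the session and a simple metastore query should go through Hive; the expected values below are just what I would expect from the configs I set, not verified output:

spark.conf.get("hive.exec.dynamic.partition.mode")  // expected: nonstrict, since it was set on the builder
spark.sql("SHOW DATABASES").show()                  // touches the Hive metastore when Hive support is enabled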

When I tried to load a dataset from HDFS into Spark, I got an exception on the line below:

val partFile = spark.read.textFile("hdfs://quickstart:8020/user/cloudera/partfile")

Exception:

The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx---

After doing some research online, I learnt that only the hdfs superuser has permission to work on the /tmp/hive directory, and that not even the root user can operate on it. I tried the command below and, as predicted, it didn't work.

hadoop fs -chmod -R 777 /tmp/hive/
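
I assume it failed because my user (cloudera) is not the HDFS superuser. If the superuser account is hdfs, which I believe is the case on the quickstart VM but have not confirmed, the same command run as that user would look like this (just a sketch, not tried yet):

sudo -u hdfs hadoop fs -chmod -R 777 /tmp/hive   # run the chmod as the hdfs superuser (assumption: account name is hdfs)
sudo -u hdfs hadoop fs -ls -d /tmp/hive          # check the resulting permissions on the directory itself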

But I performed these steps and I was able to load the file into Spark.

  1. In HDFS, remove the /tmp/hive directory ==> "hdfs dfs -rm -r /tmp/hive"

  2. At the OS level too, delete the directory /tmp/hive ==> rm -rf /tmp/hive
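
After restarting spark-shell the scratch directories get recreated and the read works. If it matters, this is how the recreated directories can be inspected (what mode they come back with is my assumption, I have not checked the exact permissions):

hdfs dfs -ls -d /tmp/hive   # HDFS scratch dir recreated by the next Hive-enabled session
ls -ld /tmp/hive            # local scratch dir recreated on the OS side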

My question is, could anyone tell me if this is the right way to fix the issue?

• Possible duplicate of [Why does spark-shell fail with "The root scratch dir: /tmp/hive on HDFS should be writable."?](https://stackoverflow.com/questions/44644206/why-does-spark-shell-fail-with-the-root-scratch-dir-tmp-hive-on-hdfs-should-b) – Jacek Laskowski Jul 06 '17 at 07:22

0 Answers