
Hi, I have installed Apache Spark 1.6.0 and I am trying to persist data to Hive using DataFrame.saveAsTable(). However, I am getting errors when it tries to create the /user/hive directory. My understanding was that I automatically got Hive when I installed a binary version of Apache Spark. I also cannot find any of the Hive config files under my $SPARK_HOME directory. To solve this, do I need to install Hive separately?

This is the error I'm getting:

java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/wikidata_perm/_temporary/0/_temporary/attempt_201601250849_0002_m_000000_0
(exists=false, cwd=file:/home/myuser/devel/sandbox/Learning/Spark/LearningSpark/AmpCampHandsOn)
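
For reference, the code I'm running looks roughly like this. It is a simplified sketch: the input file is a placeholder, but wikidata_perm is the table name from the error above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("SaveToHive"))
val hiveContext = new HiveContext(sc)

// Placeholder input; any DataFrame reproduces the problem.
val df = hiveContext.read.json("wikidata.json")

// Writes under hive.metastore.warehouse.dir, which defaults to
// /user/hive/warehouse -- the path from the IOException above.
df.write.saveAsTable("wikidata_perm")
```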
femibyte
  • I don't think you do. I guess it's [part of spark sql](https://github.com/apache/spark/tree/master/sql). If that's true, you need to add the spark-sql Maven dependency too, in order to use Hive. – Felipe Jan 26 '16 at 01:25

1 Answer


If you want Hive support, you have to build Spark with Hive and JDBC support. From the Spark build documentation:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package
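
Once the build finishes, a quick sanity check that Hive support actually made it in is to create a HiveContext in spark-shell; the class is only available on a build that included the hive profile:

```scala
// In spark-shell, where sc is already provided. On a build without
// -Phive this import fails because the class is not on the classpath.
import org.apache.spark.sql.hive.HiveContext
val hiveContext = new HiveContext(sc)
```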

Side note: your error does not seem to me to be caused by a lack of Hive support. It looks like you are just missing the proper configuration or access rights for the target directory. See this related question for help.
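
As a sketch of the configuration route (the path here is just an example; use any directory your Spark user can write to), you can point the Hive warehouse away from the default /user/hive/warehouse:

```scala
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc) // sc: an existing SparkContext

// Redirect the warehouse to a user-writable location (example path)
// instead of the default file:/user/hive/warehouse.
hiveContext.setConf("hive.metastore.warehouse.dir",
  "file:///home/myuser/hive/warehouse")
```

Alternatively, create /user/hive/warehouse yourself and give the user running Spark write access to it.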

TheMP
  • Ok, thanks. I went ahead and installed Hive anyway and created the directory with the correct permissions. It works fine now. – femibyte Jan 26 '16 at 15:23