
I am using Spark 2.2 and I'm trying to create a Hive table based on a dataframe.

I was able to create a new Hive table with data using only:

result.write.mode(SaveMode.Overwrite).saveAsTable("db.resultTable")

When I try to do the same with partitions:

result.write.mode(SaveMode.Overwrite).partitionBy("year", "month", "day").saveAsTable("db.resultTable")

I always get the error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Operation not allowed: ALTER TABLE RECOVER PARTITIONS only works on table with location provided: `db`.`resultTable`;

Note: despite the error, it created a table with the correct columns. It also created partitions, and the table has a location with Parquet files in it (/user/hive/warehouse/db.db/resultTable/year=2017/month=1/day=1), but it contains no data.

I tried looking for answers but haven't found any yet. According to this thread, I did everything right. (I also set hive.exec.dynamic.partition and hive.exec.dynamic.partition.mode.)
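For reference, the dynamic-partitioning settings mentioned above can be set like this (a sketch, assuming a Hive-enabled SparkSession named `spark`, as created via `SparkSession.builder.enableHiveSupport()`):

```scala
// Sketch: the Hive dynamic-partitioning settings mentioned above,
// assuming a Hive-enabled SparkSession named `spark`.
spark.sql("SET hive.exec.dynamic.partition = true")
spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
```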

Does anybody know what I'm missing or doing wrong?

RudyVerboven

1 Answer


Don't save it with saveAsTable; write the files directly to an HDFS directory instead:

result.write.mode(SaveMode.Overwrite).partitionBy("year", "month", "day").parquet("/path/to/table")
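If the data must still be queryable from Hive, a common follow-up (a sketch, assuming a SparkSession named `spark` with Hive support, the `/path/to/table` location used above, and illustrative column names) is to declare an external table over that path and then recover the partitions:

```scala
// Sketch: register an external Hive table over the Parquet files written above,
// then ask the metastore to discover the year/month/day partition directories.
// The data column (`value`) is an illustrative assumption.
spark.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS db.resultTable (value STRING)
  PARTITIONED BY (year INT, month INT, day INT)
  STORED AS PARQUET
  LOCATION '/path/to/table'
""")
spark.sql("MSCK REPAIR TABLE db.resultTable")
```

Because the table is external, overwriting the files later with the same writer call leaves the table definition intact; only the partitions need to be recovered again.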
justcode