
I am using the code below to create a table from a DataFrame in Databricks, and I run into an error.

df.write.saveAsTable("newtable")

This works fine the very first time, but for re-usability I rewrote it as below:

df.write.mode(SaveMode.Overwrite).saveAsTable("newtable")

I get the following error.

Error Message:

org.apache.spark.sql.AnalysisException: Can not create the managed table newtable. The associated location dbfs:/user/hive/warehouse/newtable already exists
– paone

2 Answers


The SQL config 'spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation' was removed in Spark 3.0.0. It was removed to prevent loss of user data when the config is set to a non-default value.

– hopefulnick

Run the following command to fix the issue (this is the Scala form; in a Python notebook the boolean is `True`):

     dbutils.fs.rm("dbfs:/user/hive/warehouse/newtable/", true)

Or set the flag:

     spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")

(Note that this config was removed in Spark 3.0.0, as the other answer points out, so it is only available on Spark 2.x.)
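A third option, not given in the original answers (a sketch assuming the table name `newtable` from the question): drop the managed table before rewriting. Dropping a managed table removes both the metastore entry and the files at its warehouse location, so the next saveAsTable starts from a clean state:

```sql
-- Dropping a managed table also deletes its underlying files
-- (here dbfs:/user/hive/warehouse/newtable), so saveAsTable can recreate it
DROP TABLE IF EXISTS newtable;
```

From a notebook this can be run as spark.sql("DROP TABLE IF EXISTS newtable") immediately before df.write.saveAsTable("newtable").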

– vaquar khan
  • Hi @vaquar khan, thank you for the response. I still get the error, and I am not sure why I can't use saveAsTable with overwrite. Also, the Databricks documentation advises against using dbutils.fs.rm on large datasets (https://kb.databricks.com/data/list-delete-files-faster.html). – paone Sep 10 '20 at 21:47
  • Try this: set the flag spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation to true. – vaquar khan Sep 10 '20 at 23:26