
I am using PySpark and want to insert-overwrite partitions into an existing Hive table.

  • saveAsTable() is not suitable for this use case: it overwrites the whole existing table
  • insertInto() behaves strangely: I have 3 partition levels, but it only inserts one

And what is the right way to use save()? Can save() take options like database name and table name to insert into, or only an HDFS path?

Example:

df\
.write\
.format('orc')\
.mode('overwrite')\
.option('database', db_name)\
.option('table', table_name)\
.save()
  • What about [Overwrite specific partitions in spark dataframe write method](https://stackoverflow.com/questions/38487667/overwrite-specific-partitions-in-spark-dataframe-write-method) ? – mazaneicha Mar 10 '22 at 14:28

0 Answers