I have my Spark conf set as:
sparkConf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
sparkConf.set("hive.exec.dynamic.partition", "true")
sparkConf.set("hive.exec.dynamic.partition.mode", "nonstrict")
I am writing the parquet files to an HDFS location via the DataFrame writer as:
df.write.partitionBy('asofdate').mode('append').parquet('parquet_path')
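I have also seen it suggested that writing through the metastore registers partitions at write time; a sketch of what I think that looks like, assuming the table already exists (tbl_name is a placeholder):

df.write.mode('append') \
    .format('parquet') \
    .partitionBy('asofdate') \
    .saveAsTable('tbl_name')  # writes via the metastore, so new partitions get registered

but I am not sure this fits my setup, since my table points at an existing HDFS path.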
In the HDFS location, the parquet files land under asofdate= partition directories, but the Hive table does not see the new partitions until I run 'MSCK REPAIR TABLE <tbl_name>', which I currently have to do every day. I am looking for a way to recover the table for every new partition from the Spark script itself (or to register each partition at the time it is created).
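To make the question concrete, these are the two shapes of post-write step I have in mind; tbl_name and the example date are placeholders:

# Option 1: repair the whole table right after the write
# (same as the manual step, just automated in the script)
spark.sql('MSCK REPAIR TABLE tbl_name')

# Option 2: register only the partition that was just written
asofdate = '2024-01-01'  # placeholder for the batch date
spark.sql(
    "ALTER TABLE tbl_name ADD IF NOT EXISTS "
    "PARTITION (asofdate='{0}') "
    "LOCATION 'parquet_path/asofdate={0}'".format(asofdate)
)

Is one of these the right approach, or is there a cleaner way to do it at write time?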