I have a DataFrame df that I want to write out partitioned by date (a column in df). I have the code below:
df.write.partitionBy('date').mode('overwrite').orc('path')
Under that path there are then a bunch of folders, e.g. date=2018-10-08. But the folder date=2018-10-08 contains 5 files, and what I want is to reduce that to only one file inside the date=2018-10-08 folder. How can I do that? I still want the output partitioned by date.
Thank you in advance!
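One possible direction, offered as a sketch rather than a confirmed fix: repartition the DataFrame on the same column before writing, so that all rows for a given date sit in a single in-memory partition and are written by one task. The helper name write_one_file_per_date is hypothetical; it assumes df is a PySpark DataFrame with a 'date' column and that no per-file record limit (such as maxRecordsPerFile) is configured.

```python
def write_one_file_per_date(df, path):
    """Write df as ORC partitioned by 'date', aiming for one file per date folder.

    Hypothetical helper: assumes df is a PySpark DataFrame with a 'date' column.
    """
    # repartition('date') hash-partitions the rows so that every row with the
    # same date ends up in the same in-memory partition. Each date folder is
    # then written by a single task, which should yield a single file in it.
    (df.repartition('date')
       .write
       .partitionBy('date')
       .mode('overwrite')
       .orc(path))


# Usage, assuming an existing DataFrame df:
# write_one_file_per_date(df, 'path')
```

The trade-off is that each date is handled by only one task, so a date with a very large number of rows may be slow to write.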