In our project, we use `repartition(1)` to write data into a table. I would like to know why `coalesce(1)` cannot be used here instead, since `repartition` is a costly operation compared to `coalesce`.

I know that `repartition` distributes data evenly across partitions, but when the output is a single part file, why can't we use `coalesce(1)`?
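
To make the comparison concrete, here is a toy pure-Python model of how I understand the two operations. It is only a sketch and not Spark's implementation: `toy_coalesce` and `toy_repartition` are hypothetical helper names, partitions are modelled as plain lists, and the way `toy_coalesce` picks which partitions to merge (by index) is a simplifying assumption.

```python
from itertools import chain


def toy_coalesce(partitions, n):
    """Merge whole existing partitions into n groups (models 'no shuffle').

    Assumption: partitions are grouped by index modulo n; real Spark
    chooses the grouping differently.
    """
    groups = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        groups[i % n].extend(part)  # each partition is appended as one unit
    return groups


def toy_repartition(partitions, n):
    """Deal every row round-robin into n new partitions (models a full shuffle)."""
    groups = [[] for _ in range(n)]
    for i, row in enumerate(chain.from_iterable(partitions)):
        groups[i % n].append(row)  # rows are moved individually
    return groups


# A deliberately skewed input: one large partition and three tiny ones.
skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9]]

print(toy_coalesce(skewed, 2))     # [[1, 2, 3, 4, 5, 6, 8], [7, 9]]
print(toy_repartition(skewed, 2))  # [[1, 3, 5, 7, 9], [2, 4, 6, 8]]

# With a single target partition, both end up holding the same rows.
print(toy_coalesce(skewed, 1) == toy_repartition(skewed, 1))  # True
```

In this model the two differ only in how rows are spread when there is more than one target partition; with a target of 1 the result is identical, which is why I am asking what `repartition(1)` buys us.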