How do repartition() and coalesce() work internally?
For repartition(), is the data collected on the driver node and then shuffled across the executors?
Is coalesce() a narrow or a wide transformation?
I hope this answer is helpful: Spark - repartition() vs coalesce()
Do read the answers there by Powers and Justin.
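
In case a concrete picture helps, here is a toy plain-Python sketch of the difference as I understand it. It is not Spark code and not how Spark is implemented: the helper names `coalesce` and `repartition`, the sample data, and the contiguous grouping rule are all assumptions made for illustration. The idea it models is that coalesce() without a shuffle merges whole existing partitions (a narrow dependency), while repartition() redistributes every record through a shuffle between executors rather than collecting the data on the driver.

```python
from typing import List

Partitions = List[List[int]]


def coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy model of coalesce(n) without a shuffle (hypothetical helper).

    Whole parent partitions are merged into n outputs, so no record is
    separated from the records it already shared a partition with.
    """
    if n >= len(parts):
        # Without a shuffle, the partition count cannot be increased.
        return [list(p) for p in parts]
    out: Partitions = [[] for _ in range(n)]
    for i, p in enumerate(parts):
        # Assumption: group contiguous parents; real Spark also weighs locality.
        out[i * n // len(parts)].extend(p)
    return out


def repartition(parts: Partitions, n: int) -> Partitions:
    """Toy model of repartition(n) (hypothetical helper).

    Every record is reassigned round-robin, standing in for a full shuffle.
    """
    out: Partitions = [[] for _ in range(n)]
    i = 0
    for p in parts:
        for rec in p:
            out[i % n].append(rec)
            i += 1
    return out


# Skewed input: one large partition and three small ones.
parts = [[1, 2, 3, 4, 5, 6], [7], [8, 9], [10]]
coalesced = coalesce(parts, 2)
repartitioned = repartition(parts, 2)

print([len(p) for p in coalesced])      # sizes stay uneven
print([len(p) for p in repartitioned])  # sizes are balanced
```

In this model the coalesced partitions keep the skew of their parents, which is the usual trade-off: coalesce() avoids moving data but can leave partitions uneven, while repartition() pays for a shuffle to balance them.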