With Adaptive Query Execution in Spark 3+, can we say that we no longer need to set spark.sql.shuffle.partitions explicitly at different stages of the application, given that we have set
spark.sql.adaptive.coalescePartitions.initialPartitionNum?
The Spark documentation says that dynamic coalescing will decide the number of partitions automatically:
https://spark.apache.org/docs/latest/sql-performance-tuning.html#coalescing-post-shuffle-partitions
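For reference, this is roughly the configuration I am referring to, as a minimal sketch; the value 1000 is just a placeholder, not a recommendation:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("aqe-coalesce-example")
  // Enable Adaptive Query Execution (on by default since Spark 3.2).
  .config("spark.sql.adaptive.enabled", "true")
  // Let AQE merge small post-shuffle partitions (on by default when AQE is on).
  .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
  // Initial number of partitions AQE starts from before coalescing;
  // defaults to spark.sql.shuffle.partitions if not set.
  .config("spark.sql.adaptive.coalescePartitions.initialPartitionNum", "1000")
  .getOrCreate()
```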
In my understanding, spark.sql.shuffle.partitions is the number of partitions used during the shuffle process, not what determines the number of partitions in the resulting dataframe; the latter is decided by default parallelism, coalesce, and repartition. In this context the documentation confuses me a little: it says AQE will automatically coalesce partitions after the shuffle and decide their number, so you do not need to set a proper shuffle partition number to fit your dataset.
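For example, this is the kind of check I mean (a minimal sketch, assuming the spark session from above; groupBy introduces a shuffle, and I would expect the partition count of the result to come from the shuffle settings rather than from anything I set on the dataframe itself):

```scala
import org.apache.spark.sql.functions.col

// Inspect how many partitions a shuffled DataFrame actually ends up with.
// With AQE coalescing enabled, this number is decided at runtime and is
// typically smaller than spark.sql.shuffle.partitions / initialPartitionNum.
val df = spark.range(0, 1000000)
  .withColumn("key", col("id") % 100)
  .groupBy("key")   // groupBy triggers a shuffle
  .count()

println(s"Partitions after shuffle: ${df.rdd.getNumPartitions}")
```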