As per the documentation:
Shuffle spill (memory) is the size of the deserialized form of the shuffled data in memory.
Shuffle spill (disk) is the size of the serialized form of the data on disk.
My understanding of shuffle is this:
- Every executor takes all the partitions it holds and hash-partitions them into 200 new partitions (this 200 can be changed). Each new partition is associated with the executor it will later be sent to. For example, for each existing partition:

  new_partition = hash(partitioning_id) % 200
  target_executor = new_partition % num_executors

  where % is the modulo operator and num_executors is the number of executors in the cluster.
- These new partitions are dumped onto the disk of the nodes hosting their initial executors. Each new partition will later be read by its target_executor.
- Target executors pick up their respective new partitions (out of the 200 generated)
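To make sure I am describing the scheme I have in mind unambiguously, here is a tiny Python sketch of the mapping above (the function name `assign` and the defaults `num_partitions=200`, `num_executors=4` are mine, and Python's built-in `hash` merely stands in for whatever hash Spark's partitioner actually uses):

```python
from collections import defaultdict

def assign(partitioning_id, num_partitions=200, num_executors=4):
    # The two-step mapping from my description:
    # key -> one of the 200 new partitions -> the executor that will fetch it.
    new_partition = hash(partitioning_id) % num_partitions
    target_executor = new_partition % num_executors
    return new_partition, target_executor

# Group some example keys by the executor that would later fetch them
# (integer keys, since hash(int) is stable in Python).
by_executor = defaultdict(list)
for key in [5, 203, 417]:
    part, executor = assign(key)
    by_executor[executor].append((key, part))
```

This is only meant to pin down the mechanism I am asking about, not to claim Spark implements it exactly this way.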
Is my understanding of the shuffle operation correct?
Can you help me place the definitions of shuffle spill (memory) and shuffle spill (disk) in the context of the shuffle mechanism described above (assuming it is correct)? For example, perhaps: "shuffle spill (disk) is the part happening in point 2 above, where the 200 partitions are dumped to the disk of their respective nodes." (I do not know whether that statement is correct; it is just an example of the kind of answer I am looking for.)