As I understand it, the term "shuffle" refers to randomly reordering the elements of a sequence [1]. So the first time I saw shuffling in MapReduce, I assumed it was meant to distribute the workload uniformly across nodes for load balancing. After reading the details, though, I realized that's not what it does. It isn't random; it's more like GROUP BY in SQL.
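To make the contrast concrete, here is a small Python sketch of what I mean. It is not Hadoop code: `shuffle_by_key`, the sample pairs, and the use of `zlib.crc32` as the partitioning hash are all my own assumptions for illustration.

```python
import random
import zlib
from collections import defaultdict

def shuffle_by_key(pairs, num_reducers):
    """Route each (key, value) pair to a reducer chosen by hashing the key,
    then group the values per key within each reducer (hypothetical helper)."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        # Deterministic: the same key always lands on the same reducer.
        idx = zlib.crc32(key.encode("utf-8")) % num_reducers
        reducers[idx][key].append(value)
    return [dict(r) for r in reducers]

# Hypothetical map output for a word count.
map_output = [("apple", 1), ("pear", 1), ("apple", 1), ("fig", 1), ("pear", 1)]

# What I expected "shuffle" to mean: a random reordering.
randomly_ordered = map_output[:]
random.shuffle(randomly_ordered)

# What the MapReduce shuffle seems to do: group values by key.
grouped = shuffle_by_key(map_output, num_reducers=2)
```

In this sketch, every value for a given key ends up in the same reducer's dictionary no matter how often it is run, which is why it looks like a group-by to me rather than a shuffle.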
So what's the motivation for calling this step "shuffling"? I'm new to MapReduce, so most likely I've simply missed something. I'm all ears.