I am learning the MapReduce framework and have the following questions about the same:
The MapReduce paradigm essentially has
map()
andreduce()
(and a few others as well). Can all the programming logic be effectively represented as either amap()
or areduce()
?For e.g., suppose I want to do an in-order traversal of a tree. Can this task be effectively divided only into a
map()
andreduce()
task? If yes, how? If no, then how do I leverage theMapReduce
framework for this task?
Generic code for in-order traversal:
// Iterative solution
public void inOrderIter(TreeNode root) {
if(root == null)
return;
Stack<TreeNode> s = new Stack<TreeNode>();
TreeNode currentNode=root;
while(!s.empty() || currentNode!=null){
if(currentNode!=null)
{
s.push(currentNode);
currentNode=currentNode.left;
}
else
{
TreeNode n=s.pop();
System.out.printf("%d ",n.data);
currentNode=n.right;
}
}
}
Can we have only a
map()
without a correspondingreduce()
and vice versa?As per this and this, the
reduce()
function generates the final output - is it necessary that it should generate only a single value?How do you decide if a certain task should be a part of
map()
orreduce()
?Any general pointers about how to
map
-ify andreduc
-ify a given task?