What you are referring to is classified as a task failure, which could be the failure of either a map task or a reduce task. When a particular task fails, Hadoop schedules another attempt of that failed map or reduce task on a different computational resource.

A failure during the shuffle and sort process is essentially a failure on the node where the reduce task was running, and that reduce task would be set to run afresh on another resource (note that the reduce phase begins with the shuffle and sort process).
Of course, Hadoop will not reschedule failing tasks indefinitely. Two properties determine how many attempts of a task are acceptable: mapred.map.max.attempts for map tasks and mapred.reduce.max.attempts for reduce tasks.
By default, if any task fails four times (or however many attempts you configure in those properties), the whole job is considered failed. - Hadoop: The Definitive Guide
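As an illustration, these limits could be raised in mapred-site.xml. This is only a sketch: the value 8 is an arbitrary example, and in newer Hadoop releases the equivalent property names are mapreduce.map.maxattempts and mapreduce.reduce.maxattempts.

```xml
<!-- Sketch of mapred-site.xml entries raising the per-task attempt limit.
     Property names shown are the classic (MRv1-era) names used above;
     the value 8 is an arbitrary example, not a recommendation. -->
<property>
  <name>mapred.map.max.attempts</name>
  <value>8</value>
</property>
<property>
  <name>mapred.reduce.max.attempts</name>
  <value>8</value>
</property>
```

With these settings, a single map or reduce task would be retried up to eight times before the job as a whole is marked failed.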
In short, since shuffle and sort is part of the reduce side, only the reduce task is attempted again. Map tasks are not re-run, as they are considered completed (unless the node holding their intermediate output is lost, in which case those map tasks would have to be re-run as well, since their output lives only on that node).
Is there any stage in MapReduce where output is stored in the HDFS, so that computation can restart only from there?
Only the final output is stored in HDFS. Map outputs are classified as intermediate data and are written to the local disk of the node running the map task, not to HDFS. HDFS replicates everything stored in it, and there is no reason to have HDFS replicate and manage intermediate data that is of no use once the job completes; there would also be the additional overhead of cleaning it up afterwards. Hence map outputs are not stored in HDFS.
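For reference, the local directories that hold this intermediate map output are controlled by a configuration property. A sketch, with hypothetical example paths: mapred.local.dir is the classic-MapReduce name for this property (newer releases use mapreduce.cluster.local.dir).

```xml
<!-- Sketch: local (non-HDFS) directories for intermediate MapReduce data.
     The paths here are hypothetical examples; spreading them across
     multiple physical disks spreads the shuffle I/O load. -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```

Data written under these directories is deleted when the job finishes, which is exactly why a reduce-side failure can force re-fetching (or regenerating) map output rather than reading it back from HDFS.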
And what about a map after map-reduce? Is the output of reduce stored in HDFS?
The output of the reducer is stored in HDFS. For the map phase, I hope the description above suffices.