I have a MapReduce job which I run using job.waitForCompletion(true). If one or more reducer tasks get killed or crash during the execution of the job, the entire MapReduce job is restarted and the mappers and reducers are executed again (documentation).
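For context, a stripped-down version of my driver looks roughly like this (the class names and paths are placeholders, not my real code; identity Mapper/Reducer stand in for my actual classes):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MyJobDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "my-job");
            job.setJarByClass(MyJobDriver.class);
            // Identity mapper/reducer used here only as stand-ins for my real classes.
            job.setMapperClass(Mapper.class);
            job.setReducerClass(Reducer.class);
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // Blocks until the whole job finishes; returns true on success.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }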
Here are my questions:
1] Can we know at the start of the job whether it has started fresh or has been restarted because of a failure in the previous run? (This led me to Q2)
2] Can counters help here? Do counter values get carried over when some tasks fail and the whole job is restarted? (A sketch of the kind of counter I mean follows this list.)
3] Does Hadoop provide any built-in checkpointing mechanism that keeps track of previous computation and avoids redoing the work the mappers and reducers had already completed before the failure/crash?
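To make question 2] concrete, this is the kind of counter I have in mind: a custom counter incremented from a mapper and read back in the driver after the job finishes (the class and enum names here are only illustrative, not from my actual job):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CountingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        // Illustrative custom counter; not from my actual job.
        public enum MyCounters { RECORDS_PROCESSED }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Increment the custom counter once per input record.
            context.getCounter(MyCounters.RECORDS_PROCESSED).increment(1);
            context.write(new Text(value.toString()), new LongWritable(1));
        }
    }

In the driver I would then read it with job.getCounters().findCounter(CountingMapper.MyCounters.RECORDS_PROCESSED).getValue() after waitForCompletion(true) returns. My question is whether such a value survives a whole-job restart.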
Sorry if the questions are phrased unclearly. Thanks for the help.