
Hadoop 2.6 has the following parameters, as given in the documentation:

  • mapreduce.job.max.split.locations (the maximum number of block locations to store for each split for locality calculation. How is this value used in the locality calculation?)
  • mapreduce.job.split.metainfo.maxsize (the maximum permissible size of the split metainfo file; the JobTracker won't attempt to read split metainfo files bigger than the configured value. But what is the advantage of fixing it to some value? Why can't it be flexible?)
  • mapreduce.job.counters.limit (what are these per-job user counters, and why do we want to put a limit on them?)
  • mapreduce.jobhistory.datestring.cache.size (the size of the date-string cache, which affects the number of directories scanned to find a job. What is the advantage of putting this limit in place?)
  • mapreduce.jobhistory.joblist.cache.size (the size of the job list cache. Why is this cache used?)
  • mapreduce.jobhistory.loadedjobs.cache.size (what is the difference between this and the previous one?)
  • mapreduce.jobhistory.move.thread-count (the number of threads used to move files. Are they used only to move history files, and why is this movement required?)

Are these parameters applicable to both MRv1- and MRv2-style job execution?
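For context, these parameters would typically be set in mapred-site.xml (or passed per job with -D on the command line). A minimal sketch of what that looks like, using illustrative values rather than the documented defaults (check mapred-default.xml for the actual defaults in your release):

```xml
<!-- Illustrative values only; not the shipped defaults. -->
<configuration>
  <property>
    <name>mapreduce.job.max.split.locations</name>
    <value>10</value>
  </property>
  <property>
    <name>mapreduce.job.counters.limit</name>
    <value>120</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.joblist.cache.size</name>
    <value>20000</value>
  </property>
</configuration>
```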
