
I am working on AWS EMR.

I want to get the information about a dead task node as soon as possible. But as per the default settings in Hadoop, a dead node is only detected after a 10-minute expiry interval.

This is the default key-value pair in mapred-default: mapreduce.jobtracker.expire.trackers.interval = 600000 (milliseconds)

I tried to modify the default value to 6000 ms using this link.

After that, whenever I terminate any EC2 machine in the EMR cluster, I am not able to see the state change that fast (within 6 seconds).

ResourceManager REST API: http://MASTER_DNS_NAME:8088/ws/v1/cluster/nodes
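
For reference, a minimal sketch of polling this endpoint for node states (plain JDK HTTP only, default port 8088; the host name is a placeholder, and real code would parse the JSON response properly rather than searching the raw text):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class NodeStatePoller {
        public static void main(String[] args) throws Exception {
            // MASTER_DNS_NAME is a placeholder for the EMR master node's DNS name.
            URL url = new URL("http://MASTER_DNS_NAME:8088/ws/v1/cluster/nodes");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Accept", "application/json");

            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line);
                }
            }

            // Crude check: a real implementation would parse the JSON and inspect each
            // node's "state" field (RUNNING, DECOMMISSIONING, DECOMMISSIONED, LOST).
            if (body.indexOf("\"LOST\"") >= 0 || body.indexOf("\"DECOMMISSIONED\"") >= 0) {
                System.out.println("At least one node is no longer RUNNING - trigger the event here.");
            }
        }
    }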

Questions-

  1. What is the command to check the mapreduce.jobtracker.expire.trackers.interval value in a running EMR (Hadoop) cluster?
  2. Is this the right key to use to get the state change? If it is not, please suggest another solution.
  3. What is the difference between the DECOMMISSIONING, DECOMMISSIONED and LOST node states in the ResourceManager UI?

Update

I tried a number of times, but the behaviour is inconsistent. Sometimes the node moves to the DECOMMISSIONING/DECOMMISSIONED state, and sometimes it moves directly to the LOST state after 10 minutes.

I need the state change to happen quickly so that I can trigger an event.

Here is my sample code -

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // This is the EMR configuration class, not Hadoop's:
    // com.amazonaws.services.elasticmapreduce.model.Configuration
    import com.amazonaws.services.elasticmapreduce.model.Configuration;

    List<Configuration> configurations = new ArrayList<Configuration>();

    // mapred-site: lower the tracker expiry interval (value in milliseconds)
    Configuration mapredSiteConfiguration = new Configuration();
    mapredSiteConfiguration.setClassification("mapred-site");
    Map<String, String> mapredSiteConfigurationMapper = new HashMap<String, String>();
    mapredSiteConfigurationMapper.put("mapreduce.jobtracker.expire.trackers.interval", "7000");
    mapredSiteConfiguration.setProperties(mapredSiteConfigurationMapper);

    // hdfs-site: shorten the NameNode decommission check interval (value in seconds)
    Configuration hdfsSiteConfiguration = new Configuration();
    hdfsSiteConfiguration.setClassification("hdfs-site");
    Map<String, String> hdfsSiteConfigurationMapper = new HashMap<String, String>();
    hdfsSiteConfigurationMapper.put("dfs.namenode.decommission.interval", "10");
    hdfsSiteConfiguration.setProperties(hdfsSiteConfigurationMapper);

    // yarn-site: NodeManager -> ResourceManager heartbeat interval (value in milliseconds)
    Configuration yarnSiteConfiguration = new Configuration();
    yarnSiteConfiguration.setClassification("yarn-site");
    Map<String, String> yarnSiteConfigurationMapper = new HashMap<String, String>();
    yarnSiteConfigurationMapper.put("yarn.resourcemanager.nodemanagers.heartbeat-interval-ms", "5000");
    yarnSiteConfiguration.setProperties(yarnSiteConfigurationMapper);

    configurations.add(mapredSiteConfiguration);
    configurations.add(hdfsSiteConfiguration);
    configurations.add(yarnSiteConfiguration);

These are the settings I changed on AWS EMR (internally Hadoop) to reduce the time for the state change from RUNNING to another state (DECOMMISSIONING/DECOMMISSIONED/LOST).
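
For completeness, the configurations list above is then attached when launching the cluster. A rough sketch with the AWS SDK for Java; the other RunJobFlowRequest fields (instances, release label, roles, etc.) are omitted here and the cluster name is just a placeholder:

    // Sketch: attach the classifications above at cluster-creation time.
    RunJobFlowRequest request = new RunJobFlowRequest()
            .withName("my-cluster")               // placeholder name
            .withConfigurations(configurations);  // the list built above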

devsda

2 Answers

  1. You can use "hdfs getconf". Please refer to this post: Get a yarn configuration from commandline

  2. These links give info about node manager health-check and the properties you have to check:

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html

https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManager.html

Refer "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in the below link:

https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

  3. Your queries are answered in this link:

    https://issues.apache.org/jira/browse/YARN-914

Refer the "attachments" and "sub-tasks" area. In simple terms, if the currently running application master and task containers gets shut-down properly (and/or re-initiated in different other nodes) then the node manager is said to be DECOMMISSIONED (gracefully), else it is LOST.

Update:

"dfs.namenode.decommission.interval" is for HDFS data node decommissioning, it does not matter if you are concerned only about node manager. In exceptional cases, data node need not be a compute node.

Try yarn.nm.liveness-monitor.expiry-interval-ms (default 600000 ms; that is why you saw the state change to LOST after 10 minutes - set it to a smaller value as you require) instead of mapreduce.jobtracker.expire.trackers.interval.

You have set "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" as 5000, which means, the heartbeat goes to resource manager once in 5 seconds, whereas the default is 1000. Set it to a smaller value as you require.

Marco99
  • Thanks for your updated answer. I will update you today with the results of this configuration. – devsda Aug 20 '16 at 08:42
  • Validating your answer. Will respond to you in 1 hour. – devsda Aug 21 '16 at 10:01
  • `yarn.resourcemanager.nodemanagers.heartbeat-interval-ms` is anyway 5 seconds. It's OK with this value as well. Basically, my task is: whenever any node goes down, take immediate action (the action being to add one more node). This way, I can maintain the job SLA. – devsda Aug 21 '16 at 11:33
  • @Marco99 I checked with `yarn.nm.liveness-monitor.expiry-interval-ms` set to 5000, i.e. 5 seconds. The moment I shut down the EC2 instance, the ResourceManager changed its status to LOST. But are there any consequences of this for the Hadoop cluster? – devsda Aug 21 '16 at 11:53
  • @devsda : Yes, the loss of a node manager boils down to the loss of all the tasks and application masters running there. These lost tasks and application masters have to be accommodated on other node manager(s). This is taken care of by the execution framework itself. Also, there is a node manager restart feature, as described here: https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html – Marco99 Aug 21 '16 at 17:11
  • Thanks for explaining. But my question was: is it recommended to set `yarn.nm.liveness-monitor.expiry-interval-ms` to smaller values like 30 seconds? Are there any consequences for the cluster? – devsda Aug 22 '16 at 09:33
  • Generally, too small values are not recommended because even brief network interruptions can make the NMs unavailable to the cluster and increase the load on the other NMs. The effect may cascade and could bring down all the compute nodes. IMHO, trial and error will help in deciding the optimum value for the cluster; however, this has to be simulated by studying the behaviour under a realistic load on a smaller cluster. – Marco99 Aug 22 '16 at 11:50
  • @Marco99 May I know why you got only 50 points? I put up 100 points for the bounty. – devsda Aug 29 '16 at 06:19
  • @devsda : no idea :) – Marco99 Aug 29 '16 at 06:34
  • @devsda : It required answer acceptance before the grace period ends. :) http://stackoverflow.com/help/bounty – Marco99 Aug 30 '16 at 14:44
  1. hdfs getconf -confKey mapreduce.jobtracker.expire.trackers.interval

  2. As mentioned in the other answer, yarn.resourcemanager.nodemanagers.heartbeat-interval-ms should be set based on your network; if your network has high latency, you should set a bigger value.

  3. A node is in DECOMMISSIONING when there are running containers and it is waiting for them to complete so that the node can be decommissioned.

It is in LOST when it is stuck in this process for too long; this state is reached once the configured timeout has passed and decommissioning of the node(s) could not be completed.

DECOMMISSIONED is when the decommissioning of the node(s) completes.

Reference: Resize a Running Cluster

For YARN NodeManager decommissioning, you can manually adjust the time a node waits for decommissioning by setting yarn.resourcemanager.decommissioning.timeout inside /etc/hadoop/conf/yarn-site.xml; this setting is dynamically propagated.
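
If you prefer to set it at cluster-creation time instead, mirroring the classification code in the question, a sketch would be (the value is in seconds and purely illustrative):

    // Sketch only: applied via the yarn-site classification when the cluster is created.
    // Editing /etc/hadoop/conf/yarn-site.xml on a running cluster is what propagates
    // dynamically, as described above.
    yarnSiteConfigurationMapper.put("yarn.resourcemanager.decommissioning.timeout", "60"); // seconds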

Ani Menon