
I am trying to monitor a Hadoop cluster with Nagios. My goal is to monitor the status and resource usage of all the Hadoop daemons, such as the DataNode, JobTracker and TaskTracker. The only solution I can think of is to monitor the ports these daemons use, but that seems very limited. For example, I can't see how many tasks are running on a node.

So, my question is: is there a systematic solution for monitoring Hadoop with Nagios?

Thanks,

Shumin


3 Answers


I've found this. It's a Nagios plugin to monitor HDFS. Here are all the Hadoop-related plugins at Nagios Exchange.

Marcel

There are certainly ways to monitor a Hadoop cluster with SNMP. You should install the snmp package on your Linux server. SNMP also has to be enabled on the cluster; I guess there is an option for that in some sort of web-based management console.

Once you have enabled this, you should be able to snmpwalk the cluster:

snmpwalk -v 2c -c public <ip address cluster>

... then you can write a Perl or Bash script that checks the OIDs you want to monitor. Place this script in your 'libexec' folder and define a new command for it in commands.cfg, for example check_cluster_snmp or whatever name you like.
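As a minimal sketch, assuming your cluster exposes a numeric value at some OID you care about (the OID, community string and thresholds below are placeholders you would replace), such a check script could look like this:

#!/bin/bash
# check_cluster_snmp - minimal sketch of a Nagios check over SNMP
# Usage: check_cluster_snmp <host> <oid> <warn> <crit>
HOST=$1
OID=$2
WARN=$3
CRIT=$4

# query a single OID with SNMP v2c and the 'public' community string
VALUE=$(snmpget -v 2c -c public -Oqv "$HOST" "$OID" 2>/dev/null)

if [ -z "$VALUE" ]; then
    echo "UNKNOWN - no response from $HOST for $OID"
    exit 3
fi

if [ "$VALUE" -ge "$CRIT" ]; then
    echo "CRITICAL - $OID = $VALUE"
    exit 2
elif [ "$VALUE" -ge "$WARN" ]; then
    echo "WARNING - $OID = $VALUE"
    exit 1
else
    echo "OK - $OID = $VALUE"
    exit 0
fi

A matching command definition in commands.cfg might then be:

define command {
    command_name    check_cluster_snmp
    command_line    $USER1$/check_cluster_snmp $HOSTADDRESS$ $ARG1$ $ARG2$ $ARG3$
}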

You can also check the cluster with JMX, but I don't know a lot about JMX yet.

Horaasje

Your best bet is JMX, as it gives you a view into the Java processes to check what's going on and also provides metrics (like blacklisted nodes, HDFS space status, etc.).

You can pull data from each node via the URL http://node.domain:port/jmx?qry=*adoop
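For instance, a small Nagios check that reads the live DataNode count from the NameNode's JMX servlet could look roughly like the sketch below. The port (50070), bean name (FSNamesystemState) and attribute (NumLiveDataNodes) are assumptions based on Hadoop 1.x defaults, so verify them against your own /jmx output first:

#!/bin/bash
# check_hdfs_live_nodes - sketch of a Nagios check against the NameNode JMX servlet.
# The bean/attribute names (FSNamesystemState, NumLiveDataNodes) may differ across
# Hadoop versions; confirm them in your own /jmx output before relying on this.
NAMENODE=$1      # e.g. namenode.domain
PORT=${2:-50070} # default NameNode HTTP port in Hadoop 1.x
MIN_LIVE=${3:-3} # minimum number of live DataNodes before we alert

JSON=$(curl -s "http://$NAMENODE:$PORT/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState")

# crude extraction of the NumLiveDataNodes value from the JSON response
LIVE=$(echo "$JSON" | grep -o '"NumLiveDataNodes" *: *[0-9]*' | grep -o '[0-9]*$')

if [ -z "$LIVE" ]; then
    echo "UNKNOWN - could not read NumLiveDataNodes from $NAMENODE"
    exit 3
elif [ "$LIVE" -lt "$MIN_LIVE" ]; then
    echo "CRITICAL - only $LIVE live DataNodes (expected at least $MIN_LIVE)"
    exit 2
else
    echo "OK - $LIVE live DataNodes"
    exit 0
fi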

You can take a look at these related questions:

https://stackoverflow.com/questions/16893407/are-there-advanced-http-query-parameters-for-jmx-proxy-tomcat-servlet

Is there any JMX - REST bridge available?

Thomas BDX