I'm unable to find an MRv2 equivalent of JobClient (Java, MRv1). I'm trying to read the status, counters, etc. of a running MR job. I believe I'd have to get this information from the ResourceManager, since the History Server doesn't have it until the job ends and I need to read the counters while the job is still running. Is there a client in the mapreduce API that I'm missing?

Praneeth

1 Answer

If you have the application ID of the MR job that you submitted to YARN, then you can use:

  • YarnClient (import org.apache.hadoop.yarn.client.api.YarnClient) and
  • ApplicationReport (import org.apache.hadoop.yarn.api.records.ApplicationReport)

to get application-related statistics.

For example:

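// In addition to the two imports listed above, this snippet needs:
import org.apache.hadoop.yarn.util.ConverterUtils;

// Initialize and start the YARN client
// ("configuration" is your Hadoop Configuration object)
YarnClient yarnClient = YarnClient.createYarnClient();
yarnClient.init(configuration);
yarnClient.start();

try {
    // applicationID is the application ID as a String
    ApplicationReport applicationReport =
        yarnClient.getApplicationReport(ConverterUtils.toApplicationId(applicationID));
    // Read whatever you need from the report here.
} catch (Exception e) {
    // Handle the exception.
} finally {
    // Stop the YARN client even if the lookup failed
    yarnClient.stop();
}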
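
The applicationID string used in the snippet has the form application_<clusterTimestamp>_<sequenceNumber>. If you only have the MapReduce job ID, note that for a job running on YARN it carries the same cluster timestamp and sequence number, just with a job_ prefix, so one can be derived from the other. Below is a minimal sketch of that conversion using plain string handling; the class name JobIds and the method toApplicationId are hypothetical helpers for illustration, not part of Hadoop, and the sketch assumes the job was submitted to YARN (local-mode IDs such as job_local..._0001 are rejected).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): rewrites an MR job ID string
// into the corresponding YARN application ID string.
final class JobIds {
    // Expected shape: job_<clusterTimestamp>_<sequenceNumber>, digits only.
    private static final Pattern JOB_ID = Pattern.compile("job_(\\d+)_(\\d+)");

    private JobIds() {}

    static String toApplicationId(String jobId) {
        Matcher m = JOB_ID.matcher(jobId);
        if (!m.matches()) {
            // Covers local-mode IDs and anything that is not a job ID at all.
            throw new IllegalArgumentException("Not a YARN MR job ID: " + jobId);
        }
        // Keep the timestamp and sequence digits unchanged, swap the prefix.
        return "application_" + m.group(1) + "_" + m.group(2);
    }
}
```

The returned string can then be passed to ConverterUtils.toApplicationId(...) as in the snippet above. Working on the raw string keeps the zero padding of the sequence number intact.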

Some of the information you can get from the ApplicationReport class is:

  1. Application resource usage report

  2. Application diagnostics

  3. Final application status

  4. Start and finish time

  5. Application type

  6. Priority

  7. Progress etc.

You can check the API documentation for YarnClient and ApplicationReport in the Hadoop 2.7 documentation.

Manjunath Ballur
  • Thanks for the informative response. I do not have the application ID, since I'm checking the state from one of the tasks (the reducer). Apparently, the old JobClient still works with the new MR2 API and makes the appropriate requests when given the MR job ID. – Praneeth Jul 19 '16 at 20:50
  • But isn't ApplicationReport YARN-specific? Don't I have to query the MR AM to get the MR counters? – Praneeth Jul 19 '16 at 20:58