Cannot access counter in the reducer class of MapReduce

Question

I'm incrementing a counter from the mappers in the following way:

public static class TokenizerMapper
        extends Mapper<Object, Text, Text, FloatWritable> {

    public static enum MyCounters { TOTAL }

    // Inside map():
    context.getCounter(MyCounters.TOTAL).increment(1);

I'm trying to get the value of this counter in the reducer class in the following way:

@Override
public void setup(Context context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    Cluster cluster = new Cluster(conf);
    Job currentJob = cluster.getJob(context.getJobID());
    Counters counters = currentJob.getCounters();
    Counter counter = counters.findCounter(TokenizerMapper.MyCounters.TOTAL);

But when I run the code, it always throws a java.lang.NullPointerException, because the call

cluster.getJob(context.getJobID())

always returns null.

I've tried other ways to read, in the reducer, a counter incremented in the mapper, but with no success.

Can someone please explain what exactly the problem is and how I can access the counters from the reducer? I need the value of the total count to calculate the percentage of each word.
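To make the goal concrete, here is a minimal sketch of the percentage calculation itself, in plain Java with no Hadoop dependency. The class name WordPercentages, the method percentage, and the sample words and counts are all hypothetical and not part of the original job; the variable total stands in for whatever value MyCounters.TOTAL ends up holding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the original job.
final class WordPercentages {

    // Share of one word among all counted tokens, in percent.
    // Returns a float because the job emits FloatWritable values.
    static float percentage(long wordCount, long total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return 100.0f * wordCount / total;
    }

    public static void main(String[] args) {
        // Made-up sample data: per-word counts as a reducer would sum them.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("hadoop", 3L);
        counts.put("counter", 1L);

        // Assumption: stands in for the value of MyCounters.TOTAL.
        long total = 4L;

        for (Map.Entry<String, Long> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + percentage(e.getValue(), total));
        }
    }
}
```

The only input this needs beyond the per-word sums is the total, which is why the reducer has to see the mapper-side counter value.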

This is my driver code:

Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(FloatWritable.class);
job.setNumReduceTasks(1);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);

You shouldn't create a new cluster to access the same counter. Can you tell me why you are doing this? You can get the job instance directly from the context. – Tanveer Dayan, Dec 09 '15 at 05:58
To get the job instance. I'm not really sure how to get the job instance from the context. I'll try to figure that out. – Sumit Das, Dec 09 '15 at 06:40
I hope you checked this: http://stackoverflow.com/questions/5450290/accessing-a-mappers-counter-from-a-reducer – Manjunath Ballur, Dec 10 '15 at 06:29
Yes, I'd checked that. The issue is that the job returned by getJob() is always null. I've not been able to figure out why. – Sumit Das, Dec 10 '15 at 07:53

Manjunath Ballur · Answer 1 · 2015-12-21T10:51:36.220

I am using Hadoop 2.7.0.

You don't need to instantiate the cluster to access the counters in the reducer.

In my mapper, I have the following code:

// Define an enum in the Mapper class
enum CustomCounter {Total};

// In the map() method, increment the counter for each record
context.getCounter(CustomCounter.Total).increment(1);

In my reducer, I access the counter as below:

Counter counter = context.getCounter(CustomCounter.Total);

It works perfectly for me.

These are my Maven dependencies:

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.7.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>

score 0 · Answer 2 · answered Dec 09 '15 at 07:26

You should use a JobClient in your code instead of creating a new Cluster: construct it with new JobClient(conf), then call getJob(jobId).getCounters() on it.

Tanveer Dayan

How can I construct the JobClient object? Which constructor should I call? – Sumit Das Dec 09 '15 at 07:36
JobClient jc=new JobClient(conf); jc.getJob(jobid); – Tanveer Dayan Dec 09 '15 at 09:18
`public void setup(Context context) throws IOException , InterruptedException{ JobClient client = new JobClient(context.getConfiguration()); if (client.getJob(context.getConfiguration()) == null) { System.out.println("job is null"); }` – Sumit Das Dec 09 '15 at 17:10
The above code always prints "job is null". Is conf a JobConf object? I'm using Hadoop 2.6. I don't think a JobConf is passed as a parameter. – Sumit Das Dec 09 '15 at 17:12
