I am trying to run the WordCount example.

Here is the code:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

I see output like this:

15/06/05 02:32:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/06/05 02:32:04 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/06/05 02:32:04 INFO input.FileInputFormat: Total input paths to process : 2
15/06/05 02:32:05 INFO mapreduce.JobSubmitter: number of splits:2
15/06/05 02:32:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1433448750700_0003
15/06/05 02:32:05 INFO impl.YarnClientImpl: Submitted application application_1433448750700_0003
15/06/05 02:32:05 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1433448750700_0003/
15/06/05 02:32:05 INFO mapreduce.Job: Running job: job_1433448750700_0003

I came across the post "Wordcount program is stuck in hadoop-2.3.0", but its solution did not work for me.

2 Answers

You are using Hadoop 2.6 with the mapred package. I think the new package names include mapreduce. Please follow the example here to run the correct version. You can also use link1 and link2 to understand the differences between the two.

Ramzy
  • I tried it. Same problem. –  Jun 04 '15 at 21:03
  • Are you running on a cluster or in a local workspace? If local, can you add some debug statements and see whether you get any exceptions? – Ramzy Jun 04 '15 at 21:06
  • Where do I get debug statements? –  Jun 04 '15 at 21:07
  • I mean in the Java code, and also the logs mentioned in the post you referenced. – Ramzy Jun 04 '15 at 21:08
  • Even I am not sure where it is failing, since there is no exception. One thing I observed is that the log suggests using the ToolRunner interface, which is another way of writing the job. Can you try [that](https://github.com/aamend/hadoop-mapreduce/blob/master/sandbox/src/main/java/com/aamend/hadoop/mapreduce/sandbox/ToolImplementation.java) and see if it works? Also, if your mapper is picked up, the log should show something like "mapreduce.Job: map 0% reduce 0%" followed by "mapreduce.Job: map 100% reduce 0%". – Ramzy Jun 04 '15 at 21:24
  • I will try that. Thank you for your time. –  Jun 04 '15 at 21:30
  • Please post the results once resolved and let me know if the suggestion helped. Happy coding. – Ramzy Jun 04 '15 at 21:37
  • Same problem. I think the problem is not with the code; rather, something is wrong with the configuration. –  Jun 05 '15 at 17:52

I finally solved this issue. I needed to add the following property to yarn-site.xml:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Hostname-of-your-RM</value>
    <description>The hostname of the RM.</description>
</property>

This should solve your issue.
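
For context, the log line "Connecting to ResourceManager at /0.0.0.0:8032" in the question is consistent with this property being left at its default of 0.0.0.0. If you want to check what your own yarn-site.xml contains, here is a minimal sketch using only the JDK's XML parser. The class name YarnSiteCheck and the helper findProperty are made up for illustration and are not part of Hadoop. The path to yarn-site.xml is passed as an argument, since its location depends on your installation.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not a Hadoop class): looks up one property
// in a Hadoop-style <configuration> XML document.
class YarnSiteCheck {

    static final String KEY = "yarn.resourcemanager.hostname";

    // Inline sample used when no file path is given on the command line.
    static final String SAMPLE =
        "<configuration><property>"
        + "<name>yarn.resourcemanager.hostname</name>"
        + "<value>Hostname-of-your-RM</value>"
        + "</property></configuration>";

    /** Returns the trimmed value of the named property, or null if it is absent. */
    static String findProperty(String xmlText, String name) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlText)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Read the file named by the first argument, or fall back to the sample.
        String xml = args.length > 0
            ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
            : SAMPLE;
        String host = findProperty(xml, KEY);
        if (host == null || host.equals("0.0.0.0")) {
            System.out.println(KEY + " is not set to a real host; "
                + "clients may try to reach 0.0.0.0:8032");
        } else {
            System.out.println(KEY + " = " + host);
        }
    }
}
```

On Java 11 or later, and assuming you save it as YarnSiteCheck.java, you can run it directly with `java YarnSiteCheck.java /path/to/yarn-site.xml`. This only reads the file. It does not tell you whether the ResourceManager is actually reachable at that host.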

Shash