
I tried to run the Hadoop word count example in Eclipse, but there is something wrong with it; it can't even be debugged.

package test;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import test.test.Map2.Combine;

public class test {

    public static class Map2 extends Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String values = line.split(" ")[0] + "\\|" + line.split(" ")[1];
            context.write(new Text(" "), new Text(values));
        }

        public static class Combine extends Reducer<Text, Text, Text, IntWritable> {
            ArrayList<String> top5array = new ArrayList<String>();

            public void reduce(Text key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {

                while (top5array.get(4) == null) {
                    top5array.add(values.iterator().next().toString());
                }

                while (values.iterator().hasNext()) {
                    String currentValues = values.iterator().next().toString();
                    String currentkey = currentValues.split("\\|")[0];
                    Integer currentnum = Integer.parseInt(currentValues.split("\\|")[1]);

                    for (int i = 0; i < 5; i++) {
                        Integer numofArray = Integer.parseInt(top5array.get(i).split("\\|")[1]);
                        if (top5array.get(i) != null && currentnum < numofArray) {
                            break;
                        }
                        if (i == 4) {
                            String currentKeyValuePair = currentkey + currentnum.toString();
                            top5array.add(5, currentKeyValuePair);
                            Collections.sort(top5array);
                            top5array.remove(0);
                        }
                    } // for end
                } // while end
            } // reduce end
        } // Combine end
    } // Map2 end

    public static class Reduce2 extends Reducer<Text, Text, Text, Text> {
        ArrayList<String> top5array = new ArrayList<String>();

        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {

            while (top5array.get(4) == null) {
                top5array.add(values.iterator().next().toString());
            }

            while (values.iterator().hasNext()) {
                String currentValues = values.iterator().next().toString();
                String currentkey = currentValues.split("\\|")[0];
                Integer currentnum = Integer.parseInt(currentValues.split("\\|")[1]);

                for (int i = 0; i < 5; i++) {
                    Integer numofArray = Integer.parseInt(top5array.get(i).split("\\|")[1]);
                    if (top5array.get(i) != null && currentnum < numofArray) {
                        break;
                    }
                    if (i == 4) {
                        String currentKeyValuePair = currentkey + currentnum.toString();
                        top5array.add(5, currentKeyValuePair);
                        Collections.sort(top5array);
                        top5array.remove(0);
                    }
                }
            }

            String top5StringConca = "";
            for (int i = 0; i < 5; i++) {
                top5StringConca = top5StringConca + top5array.get(i);
            }
            context.write(new Text(" "), new Text(top5StringConca));
        }
    }



    // second MapReduce stage end


    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(Map2.class);
        job.setReducerClass(Reduce2.class);
        job.setCombinerClass(Combine.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }

}

When running it, the console shows the following warning:

WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62))
 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

How can I solve the problem?


1 Answer

  1. Add the Hadoop jars to your project.
  2. If you have already configured Hadoop, then you can point to your HDFS from inside Eclipse. For that you have to include the dependencies.
  3. Add the Hadoop dependencies (if you are using Maven) to your pom.xml. Also add the third-party Hadoop plugin for Eclipse. Here is the guide. These will enable the Map-Reduce perspective in Eclipse. I have added the following dependencies in my project:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
        <scope>compile</scope>
    </dependency>
    
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.6.0</version>
    </dependency>
    
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.0</version>
    </dependency>
    
  4. You will see that the dependencies themselves contain the Hadoop jars. It now depends on whether you want to use your existing configuration or the default configuration provided by the jars (see the first sketch after this list).

  5. Now try to run your Hadoop driver class. You can easily debug the code in Eclipse. The Hadoop perspective is also enabled now, and you can add your HDFS path there (see the second sketch after this list).
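
For point 4, here is a minimal sketch of loading an existing cluster configuration instead of relying on the defaults bundled in the jars. The file paths are assumptions; substitute the location of your own Hadoop conf directory:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;

    public class ConfigExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Hypothetical paths: point these at your own Hadoop conf directory.
            conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/core-site.xml"));
            conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/hdfs-site.xml"));
            // Without addResource, the defaults bundled in the dependency jars apply.
            System.out.println(conf.get("fs.defaultFS"));
        }
    }

For point 5, one common way to debug in Eclipse is to force the job into local mode, so the Mapper and Reducer run in the same JVM as the driver and breakpoints are actually hit. A sketch, assuming Hadoop 2.x property names:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class LocalDebugExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Run the whole job in one local JVM so Eclipse breakpoints
            // inside the Mapper/Reducer are reached.
            conf.set("mapreduce.framework.name", "local");
            conf.set("fs.defaultFS", "file:///");
            Job job = Job.getInstance(conf, "local debug run");
            // Configure the mapper, reducer and input/output paths as in
            // the question, then run:
            // job.waitForCompletion(true);
        }
    }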

You can also check this for remote debugging; a sketch follows.
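
If the job runs on a real cluster, remote debugging usually means passing JDWP agent options to the task JVMs and attaching Eclipse's "Remote Java Application" launcher to the chosen port. A minimal sketch, assuming the classic `mapred.child.java.opts` property (newer releases split it into `mapreduce.map.java.opts` and `mapreduce.reduce.java.opts`):

    import org.apache.hadoop.conf.Configuration;

    public class RemoteDebugExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // With suspend=y each task JVM waits on port 8000 until a
            // debugger attaches before it starts executing the task.
            conf.set("mapred.child.java.opts",
                    "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000");
        }
    }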

  • Thanks for your reply, but I can't find the pom.xml. Can you tell me where it is, please? – qianda66 Jul 09 '15 at 10:43
  • Then you are not using a Maven-based project. Search for the jar files by `artifact-Id` name, download the jars, and include them in your project, e.g. `hadoop-client.jar`. – Nakul91 Jul 09 '15 at 10:47
  • How? Will you post the steps so that they will be useful in future? – Nakul91 Jul 09 '15 at 16:26
  • My mistake is not where you suggested; it's just my MapReduce code that is wrong. – qianda66 Jul 10 '15 at 05:50
  • The mistakes were in my Mapper. After changing " " to "\t", I can run the Mapper, but now I still can't run the combiner and reducer! – qianda66 Jul 10 '15 at 06:23