I am trying to write a MapReduce program that counts how many times each TV name occurs in the sales records. Example input:
Samsung|Optima|14|Madhya Pradesh|132401|14200
Onida|Lucid|18|Uttar Pradesh|232401|16200
Akai|Decent|16|Kerala|922401|12200
Lava|Attention|20|Assam|454601|24200
Zen|Super|14|Maharashtra|619082|9200
Below is the MapReduce code that I have written.
Mapper:
public class TotalUnitMapper extends Mapper<LongWritable,Text,Text,IntWritable> {
    Text tvname;
    //IntWritable unit;

    public void setup(Context context){
        tvname = new Text();
        // unit = new IntWritable();
    }

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException{
        String[] lineArray2 = value.toString().split("|");
        if(!lineArray2[0].contains("NA") || (!lineArray2[1].contains("NA"))){
            tvname.set((lineArray2[0]));
            IntWritable unit = new IntWritable(1);
            context.write(tvname,unit);
        }
    }
}
Reducer:
public class TotalUnitReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
    public void reduce(Text tvname, Iterable<IntWritable> values, Context context)
            throws IOException,InterruptedException{
        int sum = 0;
        for (IntWritable value : values){
            sum+= value.get();
        }
        context.write(tvname, new IntWritable(sum));
    }
}
Driver:
public class TotalUnit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "Assignment 3.3-2");
        job.setJarByClass(TotalUnit.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(TotalUnitMapper.class);
        job.setReducerClass(TotalUnitReducer.class);
        job.setNumReduceTasks(2);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job,new Path(args[1]));
        job.waitForCompletion(true);
    }
}
However, this is the output I am getting:
A 1
O 4
S 7
L 3
N 1
Z 2
Only the first letter of each TV name is printed, and I am not sure why. Is something wrong with the split? Please help, as I am a beginner in Hadoop. Thanks in advance.
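To narrow this down, here is a minimal standalone sketch outside Hadoop that isolates the split call. The class name SplitDemo and its helper methods are hypothetical and not part of my job. It relies on the documented fact that String.split takes a regular expression, in which an unescaped | is the alternation operator and can match the empty string. I am assuming Java 8 or later; older versions may return a leading empty string for the unescaped case.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class: compares three ways of splitting a pipe-delimited record.
class SplitDemo {
    // Same call as in the mapper: "|" is parsed as a regex alternation of two
    // empty patterns, so it matches between every pair of characters.
    static String[] splitUnescaped(String line) {
        return line.split("|");
    }

    // Escaping the pipe makes the regex match the literal '|' character.
    static String[] splitEscaped(String line) {
        return line.split("\\|");
    }

    // Pattern.quote produces a literal pattern for any delimiter string.
    static String[] splitQuoted(String line) {
        return line.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        String line = "Samsung|Optima|14|Madhya Pradesh|132401|14200";
        System.out.println(splitUnescaped(line)[0]); // S (on Java 8+)
        System.out.println(splitEscaped(line)[0]);   // Samsung
        System.out.println(Arrays.toString(splitQuoted(line)));
    }
}
```

If this reflects what happens inside the mapper, then lineArray2[0] holds a single character rather than the whole first field, which would match the single-letter keys in the output above, and changing the call to split("\\|") looks like the likely fix.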