I need to modify the Hadoop WordCount example so that it counts only the words that start with the prefix "cons", and then sorts the results in descending order of frequency. Can anybody tell me how to write the mapper and reducer code for this?
Code:
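import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable>
{
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
    {
        // Replace all digits and punctuation with an empty string, then lowercase the line
        String line = value.toString().replaceAll("\\p{Punct}|\\d", "").toLowerCase();
        // Split the cleaned line into words on whitespace
        StringTokenizer record = new StringTokenizer(line);
        // Emit each word as the key and 1 as its value
        while (record.hasMoreTokens())
        {
            context.write(new Text(record.nextToken()), new IntWritable(1));
        }
    }
}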
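
To make the required logic concrete, here is a minimal plain-Java sketch of the three steps: filter by prefix, sum the counts, sort by descending frequency. It deliberately does not use the Hadoop API, so it is not a drop-in mapper or reducer; the class name PrefixCountSketch and the methods mapLine and countAndSort are hypothetical names made up for this illustration. It reuses the same cleaning regex as the mapper above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical, Hadoop-free illustration of the filter / count / sort logic.
public class PrefixCountSketch {

    // "Map" step: clean the line as the mapper does and keep only
    // the tokens that start with the given prefix.
    static List<String> mapLine(String line, String prefix) {
        String cleaned = line.replaceAll("\\p{Punct}|\\d", "").toLowerCase();
        StringTokenizer tokens = new StringTokenizer(cleaned);
        List<String> emitted = new ArrayList<>();
        while (tokens.hasMoreTokens()) {
            String word = tokens.nextToken();
            if (word.startsWith(prefix)) {
                emitted.add(word);
            }
        }
        return emitted;
    }

    // "Reduce" step plus final ordering: sum one per occurrence, then sort
    // by count descending (ties broken alphabetically, an arbitrary choice).
    static List<Map.Entry<String, Integer>> countAndSort(List<String> lines, String prefix) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : mapLine(line, prefix)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("The constant is constant.");
        lines.add("Consider the console; consider the constant.");
        for (Map.Entry<String, Integer> e : countAndSort(lines, "cons")) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In an actual Hadoop job, the startsWith check would presumably sit in the mapper just before context.write, and the summing would stay in the usual WordCount reducer. Because MapReduce sorts by key rather than by value, the descending-frequency ordering is commonly handled either by a second job that emits the count as the key with a descending comparator, or by collecting the totals in the reducer and sorting them in its cleanup method when the result set is small. Treat both as options to evaluate, not as the only way to do it.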