I have a history file that consists of multiple topic pages. Each page has 14 lines, from which I need to extract an ID and then count how many times that particular document (topic page) appears in the history file. I then need to display the output sorted by number of appearances, highest first, and then by topic page ID.
In the mapper I just take the topic ID as the key and write an IntWritable of 1 for every entry.
Then in the reducer I just sum them up.
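To make the counting step concrete, here is a minimal plain-Java sketch of what the mapper and reducer do together. It does not use any Hadoop classes. The class and method names (TopicCountSketch, countAppearances) are hypothetical, and it assumes the topic ID has already been extracted from each 14-line page, since which line holds the ID is not shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the mapper + reducer pair; not real Hadoop code.
class TopicCountSketch {

    // Assumption: each element of topicIds is the ID already pulled out of one page.
    static Map<String, Integer> countAppearances(List<String> topicIds) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String topicId : topicIds) {
            // "Map" step: emit (topicId, 1). "Reduce" step: sum the ones per key.
            counts.merge(topicId, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("987634", "678945", "987634");
        // Prints one total per topic ID, keyed by topic ID.
        System.out.println(countAppearances(ids));
    }
}
```

As in the real job, the totals come out grouped by topic ID, not ordered by count.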
I can't do this with a secondary sort, because I only get the total count for each topic page after the reduce function has been called.
The output should look like:
TopicID Appearances
987634 89
678945 87
378956 76
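The ordering I am after can be sketched in plain Java as a comparator over the finished (TopicID, Appearances) totals. This is only an illustration of the desired sort, not a Hadoop solution. The names (TopicSortSketch, sortByCountThenId) are hypothetical, the row for 123456 is made up to show a tie, and I am assuming ties on the count are broken by ascending topic ID compared as strings, which matches numeric order only when all IDs have the same length.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the desired output order; not real Hadoop code.
class TopicSortSketch {

    static List<Map.Entry<String, Integer>> sortByCountThenId(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        rows.sort((a, b) -> {
            // Primary key: appearances, highest first.
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Secondary key (assumption): topic ID ascending, compared as strings.
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("378956", 76);
        counts.put("987634", 89);
        counts.put("678945", 87);
        counts.put("123456", 87); // made-up row to show the tie-break
        for (Map.Entry<String, Integer> row : sortByCountThenId(counts)) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```

The comparison needs the final total for each topic, which is why it can only run on the reducer's output and not on the keys the mapper emits.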