I have a history file that consists of multiple topic pages. Each page has 14 lines, from which I need to take an ID and then count how many times that particular document appears in the history file. I then need to display the output sorted first by number of appearances (highest first) and then by topic page ID.

In the mapper I just take the topic ID as the key and write an IntWritable of 1 against every entry.

Then in the reducer I just sum them up.
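
To make the counting step concrete, here is a rough plain-Java sketch of the same logic. It does not use any Hadoop classes; the class and method names (`TopicCountSketch`, `map`, `reduce`) are made up for illustration, and it assumes the topic IDs have already been extracted from each 14-line page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the mapper/reducer pair, in plain Java.
public class TopicCountSketch {

    // "Map" step: emit (topicId, 1) for every entry in the history file.
    static List<Map.Entry<String, Integer>> map(List<String> topicIds) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String id : topicIds) {
            pairs.add(Map.entry(id, 1));
        }
        return pairs;
    }

    // "Reduce" step: sum the ones emitted for each topic ID.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Illustrative input only: one ID per topic page occurrence.
        List<String> ids = List.of("987634", "678945", "987634");
        System.out.println(reduce(map(ids)));
    }
}
```

The totals come out keyed by topic ID, which is the same situation I end up in after the real reducer runs.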

I can't do this with a secondary sort, because the total count for each topic page is only known after the reduce function has been called.

Output would be like:

TopicID Appearances
987634 89
678945 87
378956 76
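
The ordering I am after can be sketched in plain Java as below. This only illustrates the comparison (count descending, then topic ID), not how to do it inside MapReduce. The name `TopicSortSketch` is hypothetical, and I am assuming that ties are broken by ascending numeric topic ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing the desired output order.
public class TopicSortSketch {

    // Sort by appearances (highest first), then by topic ID (ascending, assumed).
    static List<Map.Entry<Long, Integer>> sortByCountThenId(Map<Long, Integer> totals) {
        List<Map.Entry<Long, Integer>> rows = new ArrayList<>(totals.entrySet());
        rows.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // descending
            return byCount != 0 ? byCount : Long.compare(a.getKey(), b.getKey());
        });
        return rows;
    }

    public static void main(String[] args) {
        // Totals as they would look after the reduce step (values from the example above).
        Map<Long, Integer> totals = new HashMap<>();
        totals.put(378956L, 76);
        totals.put(987634L, 89);
        totals.put(678945L, 87);

        System.out.println("TopicID Appearances");
        for (Map.Entry<Long, Integer> row : sortByCountThenId(totals)) {
            System.out.println(row.getKey() + " " + row.getValue());
        }
    }
}
```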

Asad
