
I have a collection with data as follows:

    { "_id" : "279771168740729_161573583988659_462046", "user_likes" : false, "message" : "good morning ICICI Bank have a great day...waiting for today surprise", "like_count" : 0, "message_tags" : [ { "id" : "279771168740729", "name" : "ICICI Bank", "length" : 10, "offset" : 13, "type" : "page" } ], "page_username" : "icicibank", "page_id" : "279771168740729", "can_remove" : false, "from" : { "id" : "100002801855936", "name" : "Kowshik Krankz" }, "page_name" : "ICICI Bank", "post_id" : "279771168740729_161573583988659", "created_time" : "2012-11-03T04:10:31+0000" }
    { "_id" : "279771168740729_203743029752972", "icon" : "http://static.ak.fbcdn.net/rsrc.php/v2/yj/r/v2OnaTyTQZE.gif", "link" : "http://youtu.be/eKxIbLVRHRE", "page_username" : "icicibank", "caption" : "www.youtube.com", "from" : { "id" : "279771168740729", "category" : "Bank/financial institution", "name" : "ICICI Bank" }, "type" : "video", "updated_time" : "2012-07-18T04:32:24+0000", "shares" : { "count" : 40 }, "id" : "279771168740729_203743029752972", "message" : "Like Raghu, you too could be at the wrong place at the wrong time. But would you be able to clear your unpaid bills like Raghu did? Now you can! To know how, check out this video. For more details, visit http://bit.ly/NsoCY3", "picture" : "http://external.ak.fbcdn.net/safe_image.php?d=AQADR4-ELAVCbuSI&w=130&h=130&url=http%3A%2F%2Fi2.ytimg.com%2Fvi%2FeKxIbLVRHRE%2Fmqdefault.jpg", "source" : "http://www.youtube.com/v/eKxIbLVRHRE?version=3&autohide=1&autoplay=1", "status_type" : "shared_story", "likes" : { "count" : 643, "data" : [ { "id" : "100002247030669", "name" : "Angel Zoya" }, { "id" : "100002257585478", "name" : "Rakesh Kumar" }, { "id" : "100002062205767", "name" : "P.k. Choudhury" }, { "id" : "100000484071154", "name" : "Balaji Jadhvar" } ] }, "name" : "ICICI Bank", "page_id" : "279771168740729", "page_name" : "ICICI Bank", "created_time" : "2012-07-18T04:32:24+0000", "comments" : { "count" : 48 }, "actions" : [ { "link" : "http://www.facebook.com/279771168740729/posts/203743029752972", "name" : "Comment" }, { "link" : "http://www.facebook.com/279771168740729/posts/203743029752972", "name" : "Like" } ] }
    { "_id" : "279771168740729_203743029752972_572142", "user_likes" : false, "message" : ":-)", "like_count" : 4, "page_username" : "icicibank", "page_id" : "279771168740729", "can_remove" : false, "from" : { "id" : "1060073189", "name" : "Raja Bhowmik" }, "page_name" : "ICICI Bank", "post_id" : "279771168740729_203743029752972", "created_time" : "2012-07-18T04:33:57+0000" }
    { "_id" : "279771168740729_203743029752972_572155", "user_likes" : false, "message" : "@?", "like_count" : 4, "page_username" : "icicibank", "page_id" : "279771168740729", "can_remove" : false, "from" : { "id" : "100001965306815", "name" : "Akhil Pandit" }, "page_name" : "ICICI Bank", "post_id" : "279771168740729_203743029752972", "created_time" : "2012-07-18T04:39:55+0000" }
    { "_id" : "279771168740729_203743029752972_572157", "user_likes" : false, "message" : "This ad is in very bad taste given the timing of it's release and the passing away of Satwik in the Bannerghata forests in Bangalore. Maybe there is no relation, but the similarity of the situation is uncanny.", "like_count" : 4, "page_username" : "icicibank", "page_id" : "279771168740729", "can_remove" : false, "from" : { "id" : "588391958", "name" : "Vijay Alphonse" }, "page_name" : "ICICI Bank", "post_id" : "279771168740729_203743029752972", "created_time" : "2012-07-18T04:41:05+0000" }
    { "_id" : "279771168740729_203743029752972_572182", "user_likes" : false, "message" : "Lv 2 do job in a bank", "like_count" : 6, "page_username" : "icicibank", "page_id" : "279771168740729", "can_remove" : false, "from" : { "id" : "100002492179903", "name" : "Monica Chandwani" }, "page_name" : "ICICI Bank", "post_id" : "279771168740729_203743029752972", "created_time" : "2012-07-18T04:48:51+0000" }

    { "_id" : "279771168740729_203743029752972_572228", "user_likes" : false, "message" : "R u working in ici bnk", "like_count" : 4, "page_username" : "icicibank", "page_id" : "279771168740729", "can_remove" : false, "from" : { "id" : "100002412887446", "name" : "Brijesh Gaur" }, "page_name" : "ICICI Bank", "post_id" : "279771168740729_203743029752972", "created_time" : "2012-07-18T05:10:06+0000" }

Here, I need to show the top 2 posts based on the number of likes (the value of the like_count key). So the post with id 279771168740729_203743029752972_572182 will be first (6 is the highest like count), the post with id 279771168740729_203743029752972_572142 second (4 is the next highest), and so on.

I came up with two steps:

  1. Emit the likeCount and the postId
  2. Sort the likeCount descending and show first two entries

Accordingly:

var mapFunction = function() {
    var likeCount = this.like_count;
    var postId = this._id;

    if(postId != null && likeCount  !=  null){
        emit(likeCount, postId);
    }
};

var reduceFunction = function(likeCount, postIdCollection) {
/*How to maintain a single sorted list of likeCount and show the corresponding post?*/

};

I'm already confused about the sort functionality as described in the MongoDB docs; please refer to this post.

Kaliyug Antagonist

1 Answer


Unless you are actually planning to do other things with the MapReduce functionality, you would probably be better off just using a plain Mongo query for this. Your best bet is to just use a find query:

db.collectionName.find().sort({ like_count: -1 }).limit(2);

I would also recommend an index on the like_count field if you're dealing with large quantities of data (on MongoDB 3.0 and later, createIndex replaces the deprecated ensureIndex):

db.collectionName.ensureIndex({like_count: -1})

If you are really keen to do it with map reduce, then you are probably going to want to use the sort and limit options on the map reduce command:

db.collectionName.mapReduce(mapFunction, reduceFunction, { out: { inline: 1 }, sort: { like_count: -1 }, limit: 2 })

These options sort and limit the documents going into the map function, which is essentially the same as running the query above, so the MapReduce step is not doing very much for you.

If you want to try and do it with pure MapReduce, then you need a totally different approach to your map and reduce functions. The MapReduce process has an implicit sort in it on the key, which means, you can run something like this:

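var mapFunction = function() {
    var postId = this._id;

    // Skip documents without a like_count (e.g. the page post itself);
    // negating a missing value would emit NaN as the key.
    if (postId != null && this.like_count != null) {
        // Emit the negated count so the ascending key sort puts the highest counts first.
        emit(-this.like_count, postId);
    }
};

var reduceFunction = function(likeCount, postIds) {
    // Collapse all post ids sharing this like count into one comma-separated string.
    return postIds.join();
};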

db.test.mapReduce(mapFunction, reduceFunction, {out: { inline: 1 }}).results.slice(0, 2);

This takes the first two entries of the result set. Each entry holds a like_count value and a comma-separated string of the post ids that share it, so you still need to split that string to get back to the individual posts. Note that because the implicit sort order is ascending, we emit the negated like_count rather than the positive one, which puts the highest counts first. The limit option of mapReduce is left out here because it caps the number of documents fed into the map function, not the number of results. This is not strictly the top two posts, but the top two like_count values and all posts associated with them, so you will still need some post-processing.
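The post-processing step could look roughly like the sketch below. It is plain JavaScript and runs outside MongoDB; topPosts is a hypothetical helper name, and the code assumes each inline result entry has the shape { _id: <negated like_count>, value: "<postId>[,<postId>...]" }, as the map and reduce functions above would produce. The order of posts within a tied like_count is simply the order in which the ids were joined.

```javascript
// Hypothetical helper (not part of MongoDB): flattens inline mapReduce results
// into the top `n` posts. Assumes each entry looks like
// { _id: <negated like_count>, value: "<postId>[,<postId>...]" }.
function topPosts(results, n) {
    var posts = [];
    // Sort a copy by key so the most-liked group (most negative key) comes first.
    var sorted = results.slice().sort(function (x, y) { return x._id - y._id; });
    for (var i = 0; i < sorted.length && posts.length < n; i++) {
        // A key with a single post never goes through reduce, so value may be
        // a bare id; split() handles both cases.
        var ids = String(sorted[i].value).split(",");
        for (var j = 0; j < ids.length && posts.length < n; j++) {
            posts.push({ postId: ids[j], likeCount: -sorted[i]._id });
        }
    }
    return posts;
}

// Sample shaped like the question's data (like counts 6 and 4).
var sample = [
    { _id: -4, value: "279771168740729_203743029752972_572142,279771168740729_203743029752972_572155" },
    { _id: -6, value: "279771168740729_203743029752972_572182" }
];
var top2 = topPosts(sample, 2);
```

With the sample above, top2 holds the post with 6 likes followed by the first of the posts with 4 likes. Doing this on the client keeps the reduce function simple; the same logic could in principle be applied to the results array returned by the mapReduce call.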

Of course you could also use the Aggregation framework if you wanted to try another method:

db.collectionName.aggregate([{$sort: { like_count: -1 }}, {$limit: 2}]);
Simon Elliston Ball
  • I'm aware of the approach you have suggested, but I want to try it the MR way – Kaliyug Antagonist Aug 17 '13 at 10:22
    Updated to include a map reduce based example, which uses the implicit sort in map reduce. I still reckon you're better off using the query method for real use though :). – Simon Elliston Ball Aug 17 '13 at 11:21
  • I executed your code. Please confirm my understanding: the join function creates a comma-separated list of all the post ids pertaining to a count. One step remains: extracting the top n documents. I guess a finalize() function would do the job; can you provide any input on that? – Kaliyug Antagonist Aug 17 '13 at 11:50