Referencing the whole document in MongoDB Aggregation Pipeline

Question

I can reference the values of individual attributes in a MongoDB aggregation pipeline using the '$' operator. But how do I access (reference) the whole document?

UPDATE: Added an example to explain the scenario.

Here's an example of what I'm trying to do. I have a collection of tweets, and every tweet has a member 'clusters', which indicates which clusters that tweet belongs to.

{
    "_id" : "5803519429097792069",
    "text" : "The following vehicles/owners have been prosecuted by issuing notice on the basis of photographs on dated... http://t.co/iic1Nn85W5",
    "oldestts" : "2013-02-28 16:11:32.0",
    "firstTweetTime" : "4 hours ",
    "id" : "307161122191065089",
    "isLoc" : true,
    "powertweet" : true,
    "city" : "new+delhi",
    "latestts" : "2013-02-28 16:35:05.0",
    "no" : 0,
    "ts" : 1362081807.9693,
    "clusters" : [
        {
            "participationCoeff" : 1,
            "clusterID" : "5803519429097792069"
        }
    ],
    "username" : "dtptraffic",
    "verbSet" : [
        "date",
        "follow",
        "prosecute",
        "have",
        "be"
    ],
    "timestamp" : "4 hours ",
    "entitySet" : [ ],
    "subCats" : {
        "Generic" : [ ]
    },
    "lang" : "en",
    "fns" : 18.35967,
    "url" : "url|109|131|http://fb.me/2CeaI7Vtr",
    "cat" : [
        "Generic"
    ],
    "order" : 7
}

Since there are a couple of hundred thousand tweets in my collection, I want to group all tweets by 'clusters.clusterID'. Basically, I want to write a query like the following:

db.tweets.aggregate (
{ $group : { _id : '$clusters.clusterID', 'members' : {$addToSet : <????> } } }
)

I want to access the document currently being processed and reference it where I have put <????> in the above query. Does anyone know how to do this?

in a nutshell - no, there is no way to do this (there would be if you knew all the key names, but that's unlikely to be helpful). — Asya Kamsky, Feb 28 '13 at 20:03
you could do this in agg framework if you're willing to settle for a fixed set of fields of the original document. — Asya Kamsky, Mar 01 '13 at 00:49
@AsyaKamsky Yes, you are right. But that is clunky, since I have tens of fields in my document. And what if some fields are absent from some documents? There must be a mechanism to access the whole document! — VaidAbhishek, Mar 01 '13 at 11:17

score 25 · Answer 1 · edited Apr 04 '20 at 08:35

Use the $$ROOT variable:

References the root document, i.e. the top-level document, currently being processed in the aggregation pipeline stage.
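
Applied to the question's tweets collection, a minimal sketch might look like the following. It assumes MongoDB 2.6 or later (where `$$ROOT` is available) and adds a `$unwind` stage that is not part of the original query, because `clusters` is an array in the sample document; without it, the group key would be the array of cluster IDs rather than a single ID. The `pipeline` variable name is only for illustration.

```javascript
// Sketch only: assumes MongoDB 2.6+ and the tweet schema shown in the question.
var pipeline = [
  // "clusters" is an array, so produce one document per (tweet, cluster) pair.
  { $unwind: "$clusters" },
  // Group by cluster ID and collect each whole (unwound) document.
  { $group: { _id: "$clusters.clusterID", members: { $addToSet: "$$ROOT" } } }
];

// In the mongo shell:
// db.tweets.aggregate(pipeline)
```

`$$ROOT` refers to the document as it enters the stage where it is used, so here each member is the unwound tweet, with `clusters` holding a single object. Each output group is still one BSON document, so a very large cluster can exceed the 16MB document size limit.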

edited Apr 04 '20 at 08:35

Xavier Guihot

answered Feb 18 '15 at 11:00

Volox

this question was asked when MongoDB 2.2 was current - $$ROOT was added in version 2.6 (early 2014) – Asya Kamsky Dec 26 '15 at 22:51
maybe you could respond to [this question of mine](http://stackoverflow.com/questions/39288087/mongodb-collection-with-different-language-texts-select-localized-texts). The problem is that I would like to obtain the document itself, not as a subdocument, kind of `{ $group: $$ROOT }` which is not possible, and for the moment it can just be as a subdocument: `{ $group: { _id: '$$ROOT' } }` – Miquel Sep 02 '16 at 10:07
How to make this work when using a projection first? – Dane411 Jun 30 '17 at 21:07

score 2 · Answer 2 · answered Mar 01 '13 at 13:46

There is currently no mechanism to access the full document in the aggregation framework. If you only need a subset of fields, you can do:

db.tweets.aggregate([
  { $group: {
      _id: '$clusters.clusterID',
      members: { $addToSet: {
          user: "$username",   // the sample tweet stores this as "username"
          text: "$text"        // etc. for the subset of fields you want
      } }
  } }
])

Don't forget that with a few hundred thousand tweets, aggregating full documents will run you into the 16MB limit on the result document returned by the aggregation framework.

You can do this via MapReduce like this:

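var m = function() {
  var doc = this;
  // "clusters" is an array, so emit once per cluster the tweet belongs to
  this.clusters.forEach(function (c) {
    emit(c.clusterID, {members: [doc]});
  });
};

var r = function(k, v) {
  // must return the same shape that map emits; may be called more than once per key
  var res = {members: []};
  v.forEach(function (val) {
    res.members = val.members.concat(res.members);
  });
  return res;
};

db.tweets.mapReduce(m, r, {out: "output"});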
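
The reduce function returns the same `{members: [...]}` shape that map emits because MongoDB may call reduce more than once for the same key, feeding earlier results back in. The following plain-JavaScript simulation (no MongoDB involved) is a rough sketch of that behaviour; it reuses the reduce function from the answer with `res` declared locally, and the `emitted`, `oneShot`, `partial`, `reReduced`, and `ids` names, along with the sample values, are made up for illustration.

```javascript
// Reduce function from the answer, with "res" declared as a local variable.
var r = function (k, v) {
  var res = { members: [] };
  v.forEach(function (val) {
    res.members = val.members.concat(res.members);
  });
  return res;
};

// Hypothetical values emitted by map for a single cluster key.
var emitted = [
  { members: [{ _id: "a" }] },
  { members: [{ _id: "b" }] },
  { members: [{ _id: "c" }] }
];

// Route 1: reduce all emitted values in a single call.
var oneShot = r("k", emitted);

// Route 2: reduce a partial batch, then feed that result back into reduce.
var partial = r("k", emitted.slice(0, 2));
var reReduced = r("k", [partial, emitted[2]]);

// Helper: sorted list of member ids, so ordering differences do not matter.
var ids = function (res) {
  return res.members.map(function (d) { return d._id; }).sort();
};
```

Both routes end up with the same three members, which is the property a reduce function needs. A reduce that returned a bare array instead of `{members: [...]}` would break on the second route, because its own output would no longer look like a mapped value.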

I had the same issue and BatScream offered the following solution. http://stackoverflow.com/questions/34404834/how-to-group-and-select-document-corresponding-to-max-within-each-group-in-mongo?noredirect=1#comment56552218_34404834. He suggested accessing full document via $$ROOT — user1700890, Dec 24 '15 at 15:06
$$ROOT was introduced in 2.6 and was not available at the time of this question/answer. https://jira.mongodb.org/browse/SERVER-9840 — Asya Kamsky, Dec 26 '15 at 22:50

score -2 · Answer 3 · edited Mar 02 '13 at 06:24

I think MapReduce is more useful for this task.

As Asya Kamsky wrote in the comments, my example was incorrect for MongoDB; please use the official MongoDB docs.

edited Mar 02 '13 at 06:24

answered Feb 28 '13 at 20:42

Silver_Clash

you're right that map/reduce can do this, but what you gave here will not work. Your map is slightly wrong, and your reduce function seems to be missing entirely. – Asya Kamsky Mar 01 '13 at 00:47
that's not how map/reduce works. Your reduce function has to return the same format that your map function emits, and it also may be called more than once. Your test may have given the "right" looking answer for some small test set, but it won't work correctly on real data. – Asya Kamsky Mar 01 '13 at 08:30
see the docs page for mapReduce. http://docs.mongodb.org/manual/reference/method/db.collection.mapReduce/#requirements-for-the-reduce-function lists both those facts (plus the fact that reduce won't be called at all for mapped keys that only occur once) – Asya Kamsky Mar 01 '13 at 13:48
