0

I have the following SQL (mysql) query, and I would like to convert it to MongoDB. It is basically getting the top 1 document in every group:

SELECT A.* 
FROM ads AS A
INNER JOIN (SELECT id 
            FROM ads
            WHERE userId = 1
            GROUP BY ad_code
            HAVING MAX(ad_timestamp)) AS B ON B.id = A.id

From what I have read so far there are several ways to aggregate data in MongoDB (more info at MongoDB aggregation comparison: group(), $group and MapReduce):

  • group (does not work on sharded collections)
  • MapReduce
  • $group

I am trying to solve this with the MongoDB aggregation framework. So far I have this:

db.ads.aggregate([
    { $match: { userId: ObjectId("5976e215769d8a4a4d75c514") } },
    { 
        $group: { 
            _id: "$ad_code", 
            latestTimestamp: { $max: "$ad_timestamp" },
        }
    }
])

But this does not return the _ids of the matching documents, just the ad code and max timestamp, so I cannot use the data to get whole documents.

This question looks very relevant, but it doesn't seem to solve the same issue I'm having: Query one document per association from MongoDB

Thanks

EDIT for duplicate flag: this question is different to others, and different to the question that I linked, in that the solution uses the $first operand to find a 1-match, and the docIds field to retrieve un-aggregated original documents. This is different to how other questions have approached solutions, partly as a result of the evolution of MongoDB over time.

mils
  • 1,878
  • 2
  • 21
  • 42

1 Answers1

1

It doesn't returns the _id because you didn't asked for it. try this query instead :

db.ads.aggregate([
    { $match: { userId: ObjectId("5976e215769d8a4a4d75c514") } },
    { $sort: {"ad_timestamp": -1}},
    { $group: { 
        _id: "$ad_code", 
        latestTimestamp: { $first: "$ad_timestamp" },
        docIds: {$first: "$_id"}
       }
    }
])

Here we first sort by ad_timestamp and get the first of each group with $first, and same for the _id

felix
  • 9,007
  • 7
  • 41
  • 62
  • thank you thank you thank you, this stuff is really scary at the beginning. – mils Jul 28 '17 at 06:19
  • And a quick follow-up question - does that mean that if I'm not explicitly using the "latestTimestamp" field, I don't need need to declare it? Or is it needed for the $first condition? Thanks – mils Jul 28 '17 at 06:23
  • 1
    @mils if your not using `latestTimestamp`, you can remove it from the **`$group`** stage, it won't affect the results – felix Jul 28 '17 at 06:25