
I searched for similar questions but couldn't find any. Feel free to point me in their direction.

Say I have this data:

{ "_id" : ObjectId("5694c9eed4c65e923780f28e"), "name" : "foo1", "attr" : "foo" }
{ "_id" : ObjectId("5694ca3ad4c65e923780f290"), "name" : "foo2", "attr" : "foo" }
{ "_id" : ObjectId("5694ca47d4c65e923780f294"), "name" : "bar1", "attr" : "bar" }
{ "_id" : ObjectId("5694ca53d4c65e923780f296"), "name" : "bar2", "attr" : "bar" }

If I want to get the latest record for each attribute group, I can do this:

> db.content.aggregate({$group: {_id: '$attr', name: {$last: '$name'}}})
{ "_id" : "bar", "name" : "bar2" }
{ "_id" : "foo", "name" : "foo2" }

This groups my data by attr and, with the data sorted by _id, leaves only the latest record in each group, which is what I want. However, I need a way to avoid naming every field that I want in the result (in this example "name"), because in my real use case the fields are not known ahead of time.

So, is there a way to achieve this without explicitly naming each field with $last, and just take all fields instead? Of course, I would sort my data before grouping; I just need to somehow tell Mongo "take all values from the latest one".

metame
slouc
  • Field names that are not known ahead of time are an anti-pattern in MongoDB which leads to all kinds of unsolvable problems and should be avoided when possible. – Philipp Jan 20 '16 at 14:37
  • They are not completely arbitrary, it's just that I have a collection of items which can belong to one of two categories. Like a "vehicle" collection which contains both trucks and cars. Should I reorganize it so that all fields are exactly the same throughout all records in a collection? – slouc Jan 20 '16 at 14:39
  • The schemaless nature of MongoDB allows you to have "optional" fields which only exist in specific types of documents, but when you have fields which mean the same thing in different types, they should have the same name. Otherwise you run into this (and many other) problems. – Philipp Jan 20 '16 at 14:42
  • No, there are no fields which mean the same thing in different types. My two types of items overlap in 70% of attributes, and the rest are optional and specific to each type. But I would like to avoid logic and hardcoding of attribute names in my service. I'd like to take "all there is", if possible. – slouc Jan 20 '16 at 14:48

1 Answer


Here are some possible options:

  • Run a separate find().sort() query for each of the attr values you want to search.
  • Grab the original _id of the $last doc, then do a findOne() for each of those values (this is the more extensible option).
  • Use the $$ROOT system variable as shown here.

This wouldn't be the quickest operation, but I assume you're using it for analytics rather than in response to user behavior.

Edited to add slouc's example posted in comments: db.content.aggregate({$group: {_id: '$attr', lastItem: { $last: "$$ROOT" }}}).
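For anyone who wants to see what the $$ROOT approach computes, here is a minimal sketch rather than a definitive implementation. The `pipeline` array adds an explicit $sort on _id (as the question intends) and a $replaceRoot stage to unwrap the stored document; $replaceRoot assumes MongoDB 3.4 or later. The `latestPerGroup` helper is a hypothetical plain-JavaScript emulation of the same logic so the result can be checked without a server, and it represents each ObjectId as its hex string, which sorts in the same order.

```javascript
// Pipeline one could pass to db.content.aggregate(pipeline).
// Assumption: MongoDB 3.4+ (needed for $replaceRoot).
const pipeline = [
  { $sort: { _id: 1 } }, // oldest first, so $last picks the newest
  { $group: { _id: "$attr", lastItem: { $last: "$$ROOT" } } },
  { $replaceRoot: { newRoot: "$lastItem" } } // unwrap the whole document
];

// Hypothetical in-memory emulation of the pipeline, for illustration only.
function latestPerGroup(docs, groupField, orderField) {
  // Copy and sort ascending by the ordering field.
  const sorted = [...docs].sort((a, b) =>
    a[orderField] < b[orderField] ? -1 : a[orderField] > b[orderField] ? 1 : 0
  );
  const latest = new Map();
  for (const doc of sorted) {
    latest.set(doc[groupField], doc); // later documents overwrite earlier ones
  }
  return [...latest.values()];
}

// Sample data from the question, with ObjectIds written as hex strings.
const docs = [
  { _id: "5694c9eed4c65e923780f28e", name: "foo1", attr: "foo" },
  { _id: "5694ca3ad4c65e923780f290", name: "foo2", attr: "foo" },
  { _id: "5694ca47d4c65e923780f294", name: "bar1", attr: "bar" },
  { _id: "5694ca53d4c65e923780f296", name: "bar2", attr: "bar" }
];

const result = latestPerGroup(docs, "attr", "_id");
console.log(result); // one full document per attr value
```

Each result carries every field of the latest document, so no field names need to be listed. Note that MongoDB does not guarantee the order of the groups that $group emits, so add a final $sort if the order matters.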

Community
metame
  • I had no idea about $$ROOT, that helped me. Example for others: db.content.aggregate({$group: {_id: '$attr', lastItem: { $last: "$$ROOT" }}}). Note that if you're using ReactiveMongo (like I did) you need to use only one dollar sign. – slouc Jan 20 '16 at 16:05