1

Say I have a collection of people:

{ _id: ObjectId(1), name: "Bob" }
{ _id: ObjectId(1), name: "Alice" }
{ _id: ObjectId(2), name: "Bob" }
{ _id: ObjectId(2), name: "Alice" }
{ _id: ObjectId(2), name: "Charlie" }

I'm simplifying ObjectId(n) to represent n being the default timestamp mongo includes inside the ObjectId.

Now I have an application that requires fetching uniquely named people, showing only the records that was inserted the latest. So ultimately the app wants these records only:

{ _id: ObjectId(2), name: "Bob" }
{ _id: ObjectId(2), name: "Alice" }
{ _id: ObjectId(2), name: "Charlie" }

I don't want to have a unique index on name because I actually do want to store repeated people. The uniqueness constraint comes from application business logic irrelevant to the collection.

Is there a way I can query mongo to directly get these results, or is this logic something I have to write up in my application after retrieving all records? I don't have much mongoDB experience so there might be an obvious solution I'm unaware of.

niebula
  • 351
  • 1
  • 6
  • 13

1 Answers1

-1

You can write an aggregation query :

db.collection.aggregate([
  {
    $group: {
      _id: "$name", // group on name field
      doc: { $last: "$$ROOT" } // push last doc into a field `doc` (last inserted doc will come as last one in iteration)
    }
  },
  {
    $replaceRoot: { newRoot: "$doc" } // make `doc` field as new root of document
  }
])

Test : mongoplayground

Ref : aggregation-pipeline

whoami - fakeFaceTrueSoul
  • 17,086
  • 6
  • 32
  • 46
  • See my comment above. `$distinct` is simpler and does exactly what the user wants. – Joe Drumgoole Jun 16 '20 at 21:32
  • @JoeDrumgoole : I was aware of `.distinct()` would you be able to get latest document with `.distinct` ? I thought OP was asking to get latest document - was confused when he said *keeping latest record* if it does mean keeping it in DB then `.distinct()` is enough but if it mean to keeping latest docs in result then I thought aggregation can do it..!! – whoami - fakeFaceTrueSoul Jun 16 '20 at 21:36
  • 1
    Sorry for the confusion, reworded the question. By "keeping the latest record" I did mean keeping the docs in the result the application receives, not the actual DB. – niebula Jun 16 '20 at 21:57
  • I think it’s best to explicitly `$sort` the data prior to the `$group` stage – thammada.ts Jun 17 '20 at 05:16
  • @thammada : not sure why it’s needed, may be if you can explain it a bit I’ll update my answer accordingly.. – whoami - fakeFaceTrueSoul Jun 17 '20 at 06:05
  • @whoami see this [question](https://stackoverflow.com/q/11599069/1656749) for discussion on how the documents are sorted by default. It's an old question and not targeted specifically for aggregation, but I think it still applies. If the OP has a `$match` stage that is going to use a different index than `_id` then the order will be according to that index. – thammada.ts Jun 17 '20 at 08:18
  • @whoami Yes your answer worked, thank you. I'm also wondering about explicitly sorting. "(last inserted doc will come as last one in iteration)" how is this guaranteed? I'm reading the "group" docs and they mention "$group does not order its output documents." Do input documents have a guaranteed order though? (assuming I keep _id as the default ObjectID) – niebula Jun 17 '20 at 16:15