I have a MongoDB collection of documents having two properties: type & value.

[
  {type: "A", value: "1"},
  {type: "A", value: "2"},
  {type: "B", value: "1"},
  {type: "B", value: "2"},
  {type: "C", value: "1"},
  {type: "C", value: "2"}
]

How can I get one random document of each type using a single query without any JavaScript involved?

I tried to figure something out using the aggregation framework

db.collection.aggregate([
  {$group: {_id: "$type", item: {$push: "$$ROOT"}}},
  {$sample: {size: 1}}
]);

which does not apply the sampling on each group but simply selects one of the groups.

Mouz

3 Answers

Alternatively, you can iterate over the grouped results and pick a random element from each group on the client side instead of using $sample:

db.collection.aggregate([
    { $group: { _id: "$type", item: { $push: "$$ROOT" } } }
]).forEach(function(data) {
    var rand = data.item[Math.floor(Math.random() * data.item.length)];
    // insert in a new collection if needed
    // db.results.insert(rand);
    printjson(rand);
});

To pick a random element from an array, see: Getting a random value from a JavaScript array

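This solution can be slow because the random pick is not done by the aggregation pipeline: every document is pushed into its group's array and sent to the client, which matters if the collection is large or has many distinct values of type.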
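
If the random pick has to stay inside the query, one possible sketch moves the same index computation into a `$project` stage. It assumes a server that provides the `$rand` operator (MongoDB 4.4.2 or later, as far as I know) and `$toInt` (4.0 or later); `perTypePipeline` and the `items` field are names made up for this illustration.

```javascript
// Sketch only: assumes MongoDB 4.4.2+ for $rand; names are illustrative.
const perTypePipeline = [
  // Collect all documents of each type into one array per group.
  { $group: { _id: "$type", items: { $push: "$$ROOT" } } },
  // Pick one array element per group at a random index in [0, size).
  {
    $project: {
      item: {
        $arrayElemAt: [
          "$items",
          // $rand yields a float in [0, 1); $toInt truncates the product.
          { $toInt: { $multiply: [{ $rand: {} }, { $size: "$items" }] } }
        ]
      }
    }
  }
];

// In the mongo shell: db.collection.aggregate(perTypePipeline);
```

Each group is still built as a single array on the server, so the BSON document size limit applies to very large groups.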

Bertrand Martel
  • I would like to achieve this exclusively in the query, not by processing the query result with some extra code. – Mouz Feb 23 '17 at 12:32

Not sure if this achieves what you are looking for, but I have been trying to do something similar: I need to find the group of documents that matches my criteria and then take a random sample from it.

I have been able to achieve this using $match and $sample, along the lines of the following:

db.collection('collectionname').aggregate([
  // keep only documents of one type
  {$match: {'type': 'B'}},
  // then pick one of them at random
  {$sample: {size: 1}}
]);
Andrew
  • I guess this will only return a sample of B type objects, not a sample of each type. – Mouz Feb 23 '17 at 12:30
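
One way to extend the `$match` + `$sample` idea to every type in a single query might be to run that pair as a `$lookup` sub-pipeline once per distinct type. This is a sketch assuming MongoDB 3.6 or later (for `$lookup` with `let`/`pipeline`) and a collection literally named `collection`; `sampleEachType` and the variable `t` are illustrative names.

```javascript
// Sketch only: assumes MongoDB 3.6+ and a collection named "collection".
const sampleEachType = [
  // One result document per distinct type.
  { $group: { _id: "$type" } },
  // For each type, run $match + $sample against the same collection.
  {
    $lookup: {
      from: "collection",
      let: { t: "$_id" },
      pipeline: [
        { $match: { $expr: { $eq: ["$type", "$$t"] } } },
        { $sample: { size: 1 } }
      ],
      as: "item"
    }
  },
  // "item" is a one-element array; unwrap it and return the original document.
  { $unwind: "$item" },
  { $replaceRoot: { newRoot: "$item" } }
];

// In the mongo shell: db.collection.aggregate(sampleEachType);
```

The sub-pipeline runs once per distinct type, so the cost grows with the number of types.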

Using mapReduce (which is deprecated as of MongoDB 5.0), one way to do this is:

db.coll.mapReduce(
  /* group by type */
  function() { emit(this.type, this.value); },
  /* select one random value; Math.floor keeps the index within 0..vs.length-1 */
  function(u, vs) { return vs[Math.floor(Math.random() * vs.length)]; },
  /* return the results directly, or use {out: "coll_name" } */
  {out: {inline: 1}} 
)

The out parameter can also be the name of a collection. Result:

{
    "results" : [
        {
            "_id" : "A",
            "value" : "2"
        },
        {
            "_id" : "B",
            "value" : "2"
        },
        {
            "_id" : "C",
            "value" : "1"
        }
    ],
    "timeMillis" : 24,
    "counts" : {
        "input" : 6,
        "emit" : 6,
        "reduce" : 3,
        "output" : 3
    },
    "ok" : 1
}
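
When picking a random index as in the reduce function, the rounding mode matters. The plain JavaScript sketch below shows why flooring is the safer choice; `pickRandom`, `rng` and `nearOne` are illustrative names, not part of any MongoDB API.

```javascript
// Sketch: pick one element uniformly; `rng` can be swapped in for testing.
function pickRandom(values, rng = Math.random) {
  // rng() is in [0, 1), so the floored product is in [0, values.length - 1].
  return values[Math.floor(rng() * values.length)];
}

// Math.round can reach values.length, which is one past the last index.
const values = ["1", "2"];
const nearOne = () => 0.99;
console.log(pickRandom(values, nearOne));                   // "2"
console.log(values[Math.round(nearOne() * values.length)]); // undefined
```

Rounding also gives the first and last elements half the weight of the others, so flooring is needed for a uniform pick as well.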
Derlin