I have a MongoDB collection in which every document has a field that is either 0 or 1. I need to randomly sample 1000 documents from the collection and count how many of them have that field set to 1. I need to repeat this sampling 1000 times. How do I do it?
possible duplicate of [Random record from MongoDB](http://stackoverflow.com/questions/2824157/random-record-from-mongodb) – Amir Ali Akbari Mar 25 '15 at 13:27
5 Answers
For people arriving at this answer now: use the $sample aggregation stage, introduced in MongoDB 3.2.
https://docs.mongodb.org/manual/reference/operator/aggregation/sample/
db.collection_of_things.aggregate(
[ { $sample: { size: 15 } } ]
)
Then add another stage that uses $group to count the 0s and 1s. The MongoDB docs include an example of this.
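As a rough sketch of how the two stages and the 1000 repetitions could fit together from Python: the helper names (`build_sample_count_pipeline`, `count_ones`, `repeat_sampling`) and the field name `thefield` are made up for illustration, and `collection` is assumed to be a PyMongo-style collection object exposing `aggregate()`.

```python
from typing import Any, Dict, List


def build_sample_count_pipeline(field: str, sample_size: int) -> List[Dict[str, Any]]:
    """Draw a random sample, then count the sampled documents per value of `field`."""
    return [
        {"$sample": {"size": sample_size}},
        {"$group": {"_id": "$" + field, "count": {"$sum": 1}}},
    ]


def count_ones(collection: Any, field: str, sample_size: int) -> int:
    """Run one sampling round; return how many sampled documents had field == 1."""
    pipeline = build_sample_count_pipeline(field, sample_size)
    # Assumption: `collection.aggregate` yields dicts such as {"_id": 1, "count": 520}.
    for group in collection.aggregate(pipeline):
        if group["_id"] == 1:
            return group["count"]
    return 0  # no sampled document had the value 1


def repeat_sampling(collection: Any, field: str,
                    sample_size: int = 1000, rounds: int = 1000) -> List[int]:
    """Repeat the sampling round `rounds` times and collect the counts of 1s."""
    return [count_ones(collection, field, sample_size) for _ in range(rounds)]
```

With a real connection this might be called as `repeat_sampling(db.collection_of_things, "thefield")`, which issues one aggregation per round.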

For MongoDB 3.0 and before, I use an old trick from SQL days (which I think Wikipedia uses for its random page feature). I store a random number between 0 and 1 in every object I need to randomize; let's call that field "r". You then add an index on "r".
db.coll.ensureIndex({r: 1});
Now, to get x random objects, you use:
var startVal = Math.random();
db.coll.find({r: {$gt: startVal}}).sort({r: 1}).limit(x);
This gives you random objects in a single find query. Depending on your needs, this may be overkill, but if you are going to do a lot of sampling over time, it is a very efficient approach that puts little load on your backend.
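One edge case worth sketching: when startVal lands close to 1, fewer than x objects may have r greater than it. The plain-Python simulation below does not talk to MongoDB at all; a sorted list stands in for the index on "r", and the function name `sample_by_random_key` is hypothetical. The wrap-around step is an assumption of mine rather than part of the trick as described above; against a real collection it would presumably be a second query for the lowest values of r.

```python
import bisect
import random
from typing import List, Optional


def sample_by_random_key(sorted_keys: List[float], x: int,
                         start: Optional[float] = None) -> List[float]:
    """Return up to x keys that follow a random start point in sorted order.

    `sorted_keys` stands in for the index on "r"; `start` plays the role of startVal.
    """
    if start is None:
        start = random.random()
    x = min(x, len(sorted_keys))
    # Position of the first key strictly greater than start (the $gt condition).
    first = bisect.bisect_right(sorted_keys, start)
    picked = sorted_keys[first:first + x]
    if len(picked) < x:
        # Too few keys above start: wrap around and take the rest from the low end.
        picked += sorted_keys[:x - len(picked)]
    return picked
```

For example, with keys `[0.1, 0.2, 0.5, 0.9]`, `x = 2` and a start of `0.6`, only `0.9` lies above the start, so the second pick wraps around to `0.1`.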

@Sklavit if you generate a new random number each time your sample will be different – Nic Cottrell Jun 24 '18 at 13:48
Here's an example in the mongo shell, assuming a collection named collname and a value of interest in thefield:
var total = db.collname.count();
var count = 0;
var numSamples = 1000;
for (var i = 0; i < numSamples; i++) {
    // Pick a random offset and fetch the single document at that position
    var random = Math.floor(Math.random() * total);
    var doc = db.collname.find().skip(random).limit(1).next();
    if (doc.thefield == 1) {
        count++;
    }
}

This also answers one other question: that unlike SQL, MongoDB does not have a built in function for this really. Also that skip could (...could) become troublesome for larger random values, depends though. – Sammaye Oct 01 '12 at 13:59
Just a warning that calling skip with very large values can cause a lot of work on the server side, and can be pretty slow, right @Stennie? – Nic Cottrell Jun 18 '22 at 16:45
I was going to edit my comment on @Stennie's answer with this, but you could also use a separate auto-incrementing ID index as an alternative if you had to skip over HUGE numbers of records (talking huge here).
I wrote another answer to a question a lot like this one, where someone was trying to find the nth record of a collection:
php mongodb find nth entry in collection
The second half of my answer basically describes one potential method by which you could approach this problem. You would still need to loop 1000 times to get the random row of course.
If you are using mongoengine, you can use a SequenceField to generate an incremental counter.
class User(db.DynamicDocument):
    counter = db.SequenceField(collection_name="user.counters")
Then, to fetch a random list of, say, 100 users, do the following:
import random

def get_random_users(number_requested):
    total = User.objects.count()
    # Draw distinct counter values, capped at the number of existing users
    users_to_fetch = random.sample(range(1, total + 1), min(number_requested, total))
    return User.objects(counter__in=users_to_fetch)
where you would call
get_random_users(100)
