I have 100K tweets stored in MongoDB. Each tweet is stored like the following:

{
    "_id" : "123456789",
    "user_screenName" : "john doe",
    "text" : "some tweet"
}

I have found http://bdadam.com/blog/finding-a-random-document-in-mongodb.html and "MongoDB: how to find 10 random document in a collection of 100?", but I am not sure whether either is exactly what I need.

I want to get the text fields of 200 random tweets so I can analyze them.

SFC

1 Answer

You can use the $sample aggregation stage (available since MongoDB 3.2) for that.

db.collection.aggregate([
    { $sample: { size: 200 } }, // select 200 random documents
    {
        $project: {
            "_id": 0, // exclude "_id"
            "text": 1 // include "text"
        }
    }
])
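
If you need this more than once, the pipeline could be wrapped in a small helper so that the sample size and the field are parameters. This is only a sketch: the helper name buildSamplePipeline and the collection name tweets are assumptions for illustration, not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds an aggregation pipeline
// that samples `size` random documents and keeps only `field`.
function buildSamplePipeline(size, field) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  return [
    { $sample: { size: size } },          // pick `size` random documents
    { $project: { _id: 0, [field]: 1 } }, // drop "_id", keep only `field`
  ];
}

// Usage in the mongo shell, assuming the collection is named "tweets":
//   const texts = db.tweets
//     .aggregate(buildSamplePipeline(200, "text"))
//     .toArray()
//     .map(doc => doc.text);
```

The final map turns the result into a plain array of strings, which may be easier to feed into an analysis step than an array of documents.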

Also, MongoDB Compass provides some nice functionality for analyzing existing data.

dnickless