We've recently hit the >2 Million records for one of our main collections and now we started to suffer for major performance issues on that collection.
They documents in the collection have about 8 fields which you can filter by using UI and the results are supposed to sorted by a timestamp field the record was processed.
I've added several compound indexes with the filtered fields and the timetamp e.g:
db.events.ensureIndex({somefield: 1, timestamp:-1})
I've also added couple of indexes for using several filters at once to hopefully achieve better performance. But some filters still take awfully long time to perform.
I've made sure that using explain that the queries do use the indexes I've created but performance is still not good enough.
I was wondering if sharding is the way to go now.. but we will soon start to have about 1 million new records per day in that collection.. so I'm not sure if it will scale well..
EDIT: example for a query:
> db.audit.find({'userAgent.deviceType': 'MOBILE', 'user.userName': {$in: ['nickey@acme.com']}}).sort({timestamp: -1}).limit(25).explain()
{
"cursor" : "BtreeCursor user.userName_1_timestamp_-1",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : 30060,
"nscanned" : 30060,
"nscannedObjectsAllPlans" : 120241,
"nscannedAllPlans" : 120241,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 1,
"nChunkSkips" : 0,
"millis" : 26495,
"indexBounds" : {
"user.userName" : [
[
"nickey@acme.com",
"nickey@acme.com"
]
],
"timestamp" : [
[
{
"$maxElement" : 1
},
{
"$minElement" : 1
}
]
]
},
"server" : "yarin:27017"
}
please note that deviceType has only 2 values in my collection.