
I have a collection with more than 1 billion documents in it, and each document is about 2 KB in size.

Sample document:
{
    "sensor" : "temperature",
    "timestamp" : ISODate("2018-02-22T01:55:00Z"),
    ... some more key values
}

I have separate single-field indexes on the sensor and timestamp keys. Find query:

db.collection.find({"sensor": "temperature",
                    "timestamp": {"$gte": ISODate("2018-02-22T00:00:00Z"),
                                  "$lt": ISODate("2018-02-23T00:00:00Z")}})

But it takes around 20-30 minutes to respond, which is not acceptable for my application. What is a better way to optimize this query so that it returns results in less than 30 seconds (mongoose returns a timeout error if a query takes longer than 30 seconds)? Is there any solution for optimizing the query other than sharding?

Krunal Sonparate
  • First thing to try would be a composite key over `sensor, timestamp` (and maybe another key after that). How many results will you get back from that query? – Thilo Feb 22 '19 at 07:56
  • Have you tried [explaining results](https://docs.mongodb.com/manual/reference/explain-results/) and then applying the appropriate indexes? – chridam Feb 22 '19 at 07:57
  • Also, sharding does not improve performance that much; it is more about scaling. With X shards, unindexed queries drop to best case 1/X, meaning you go from 30 minutes to 5-10 minutes. – Thilo Feb 22 '19 at 07:58
  • @Thilo I will get around 1440 resulting documents from that query – Krunal Sonparate Feb 22 '19 at 07:58
  • 1440 after you apply those extra keys you left out in the example. How many without? You may need to add one of those to the index as well. – Thilo Feb 22 '19 at 07:59
  • Useful posts https://www.sitepoint.com/7-simple-speed-solutions-mongodb/ and https://docs.mongodb.com/manual/tutorial/optimize-query-performance-with-indexes-and-projections/ – chridam Feb 22 '19 at 08:00
  • I would suggest you add an index on `timestamp` (or perhaps `timestamp`, `sensor`) as it will likely provide the biggest selectivity (also see this: https://stackoverflow.com/questions/33545339/how-does-the-order-of-compound-indexes-matter-in-mongodb-performance-wise). This index will be big, though, so you need to have quite some RAM in order for MongoDB to be able to keep it loaded. – dnickless Feb 22 '19 at 08:42
  • For a billion docs, maybe you should use a cursor. – Saikat Chakrabortty Feb 22 '19 at 10:28
  • @saikatchakrabortty Thanks for the suggestion. I have tried using a cursor, but iterating the cursor takes a long time – Krunal Sonparate Feb 22 '19 at 14:38
  • @dnickless Thanks for the suggestion. I have indexes on timestamp and sensor, but individually. So are you suggesting a combined (compound) index of timestamp and sensor? – Krunal Sonparate Feb 22 '19 at 14:40
  • It depends on your data distribution with respect to those two properties, I suppose. But I would try both versions anyway and use .explain() to see what's going on (see the sketch below). – dnickless Feb 22 '19 at 14:46
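
Following the comments, here is a minimal sketch of the suggested compound index and of checking the query plan with explain(). The field order and the `collection` name are assumptions; as noted above, it is worth trying both orders and comparing the explain output:

    // Compound index: equality field first, then the range field
    db.collection.createIndex({ "sensor": 1, "timestamp": 1 })

    // Alternative order suggested in the comments; compare with explain()
    db.collection.createIndex({ "timestamp": 1, "sensor": 1 })

    // Inspect which index is used and how many keys/documents are examined
    db.collection.find({
        "sensor": "temperature",
        "timestamp": { "$gte": ISODate("2018-02-22T00:00:00Z"),
                       "$lt":  ISODate("2018-02-23T00:00:00Z") }
    }).explain("executionStats")

In the executionStats output, totalKeysExamined close to nReturned (around 1440 here) suggests the index is selective; a much larger totalDocsExamined means the query is still scanning many documents.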

0 Answers