
My production database is 300 GB in size. Suppose I need the last 10 days of records for a particular asset. I can either send the whole 10-day span as a single from/to date range, so that MongoDB returns all the data at once, or run a loop that sends one query per day, each with that day as the from/to range. I have tried both but can't determine the benefit of either. In some cases, I have found that sending a request for each day works better than sending the whole request at once.

I want to know which approach is more efficient, and what the trade-offs are between memory and CPU usage.
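To make the two approaches concrete, here is a minimal Python sketch of the query filters each one would send. The field names `asset_id` and `timestamp`, the asset value `"asset-42"`, and the helper names are assumptions for illustration only; the real schema is not shown here.

```python
from datetime import datetime, timedelta

# Hypothetical field names: the actual schema is not part of the question.
ASSET_FIELD = "asset_id"
TIME_FIELD = "timestamp"


def range_filter(asset_id, start, end):
    """Build one filter covering the half-open interval [start, end)."""
    return {ASSET_FIELD: asset_id, TIME_FIELD: {"$gte": start, "$lt": end}}


def daily_filters(asset_id, start, end):
    """Split [start, end) into consecutive filters of at most one day each."""
    filters = []
    window_start = start
    while window_start < end:
        window_end = min(window_start + timedelta(days=1), end)
        filters.append(range_filter(asset_id, window_start, window_end))
        window_start = window_end
    return filters


end = datetime(2018, 12, 31)
start = end - timedelta(days=10)

# Approach 1: a single query for the whole 10-day range.
single = range_filter("asset-42", start, end)

# Approach 2: one query per day, sent in a loop.
per_day = daily_filters("asset-42", start, end)
```

Each filter could then be passed to a driver's `find()` call. Using half-open intervals keeps a record that falls exactly on a day boundary from being returned twice.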

Thank you in advance.

O_o
  • You should paginate your queries so you get a set number of documents at a time: https://stackoverflow.com/questions/5049992/mongodb-paging – kmdreko Dec 31 '18 at 05:47
  • @kmdreko, I know this. I wanted to know which one is more efficient in terms of memory and CPU usage, or what the trade-offs between them are. – O_o Dec 31 '18 at 06:15
  • Fetching everything at once would likely use more memory to process the request, but marginally less CPU, since only one query needs to be parsed. But I'm not an expert; if you have issues or concerns, you should test it yourself in your specific scenario. – kmdreko Dec 31 '18 at 06:43
  • You need to test scenarios in your environment. It really depends on your server resources, indexes, how you are iterating, and the size of your query result set. Drivers retrieve data from query cursors in [batches](https://docs.mongodb.com/manual/reference/method/cursor.batchSize/), so for a large data set the database server won't be fetching all of the data at once. However, if your application is trying to store the full result set in memory (rather than iterating) there may be performance issues if the result set is significantly larger than available RAM on your application server. – Stennie Dec 31 '18 at 07:03
  • @Stennie, okay, I get it. I do test scenarios in my environment. The problem is that I cannot determine beforehand which query is going to return a lot of data. There is no benchmark. For example, one asset has only 1,000 lines of JSON, while another has over 100k, and sometimes the JSON file grows to 3 or 4 MB in size. – O_o Dec 31 '18 at 07:38
  • You can control how much data your application is working with: a `find()` query returns a cursor which your application iterates. With the current information it is not clear how you are processing your query results, but it sounds like you are fetching all results into application memory rather than iterating or streaming the result set. You could make this question less abstract by including an example of your code for the two approaches you are trying to compare. – Stennie Dec 31 '18 at 23:26
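
The difference between iterating a cursor and loading the full result set, as described in the comments above, can be sketched in Python roughly as follows. This assumes `collection` behaves like a pymongo `Collection`, where `find()` returns a cursor and `batch_size()` limits how many documents each round trip returns. The function names and the default batch size are hypothetical.

```python
def stream_documents(collection, query, batch_size=500):
    """Yield matching documents one by one.

    `collection` is assumed to behave like a pymongo Collection:
    find() returns a cursor, and batch_size() caps how many documents
    the server sends per round trip.
    """
    cursor = collection.find(query).batch_size(batch_size)
    for document in cursor:
        yield document


def count_streaming(collection, query, batch_size=500):
    """Process results incrementally; only the current batch is buffered."""
    count = 0
    for _ in stream_documents(collection, query, batch_size):
        count += 1
    return count


def count_materialized(collection, query):
    """Load every matching document into a list before processing.

    Memory use grows with the size of the whole result set.
    """
    documents = list(collection.find(query))
    return len(documents)
```

Both functions return the same count. The streaming version keeps application memory roughly bounded by the batch size, whether the asset has 1,000 records or 100k, while the materialized version holds the entire result set at once.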

0 Answers