I am new to MongoDB. I started using it because I need to run queries over roughly 300 million documents. I created a collection whose documents are structured like this:

LogsCollection:

{  LogID, LogName, Version, Serial, Year, Month, Day, Feature { FeatureID, Name, Hour, Minute, second, millisecond  }}

I inserted 300 million documents into the collection using the C# driver, so each document is a BsonDocument.

Now I am trying to count the documents with Year = 2012. The query takes more than 15 minutes. Is this expected for the 300 million documents I have inserted, or should MongoDB perform better?

I also doubt whether the structure I chose for the collection is correct. Can anyone guide me on this?

The queries are mostly based on the date or time and on FeatureID.

Salman Mohammad
user2439903
  • You should create an index on Year if you want to run queries on it, like `db.logs.ensureIndex({'Year': 1})`. – Brett Nov 07 '13 at 07:55
  • Could you provide output from `explain`? – zero323 Nov 07 '13 at 07:55
  • Thanks for the quick reply. My queries may filter on different fields, such as **Year, Month, Day, and even FeatureID / Name**. Would an index on a single field like **Year** help? The queries may also use 2 or 3 fields, and the combination changes each time. How should the indexing be done? – user2439903 Nov 07 '13 at 08:13
  • Did you find an answer for that? – Erez Jul 06 '16 at 09:37

1 Answer

This is certainly not expected behavior.

I would recommend a few changes. Since you are new to MongoDB, I assume you have no indexes on your collection, so the query performs a full collection scan (it checks every document). It is good practice to index the keys you query frequently. So do the following:

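// ensureIndex is the older shell helper; newer MongoDB versions name it createIndex
db.logs.ensureIndex({'Year': 1})
// FeatureID is nested under Feature in the question's document, so use dot notation
db.logs.ensureIndex({'Feature.FeatureID': 1})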

Also, I would recommend converting these separate date/time keys into a single Date() field and then performing time range queries on it.
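
As a rough sketch of that conversion, the helper below collapses the separate keys of one document into a single JavaScript Date. The function name `toLogDate` is hypothetical, and it assumes the field names from the question's layout, that Month is stored 1-based (January = 1), and that all values are in UTC; adjust if your data differs.

```javascript
// Hypothetical helper: collapse the separate date/time keys of one log
// document into a single JavaScript Date (MongoDB stores it as a BSON date).
// Assumptions: Month is 1-based (January = 1) and all values are in UTC.
function toLogDate(doc) {
  var f = doc.Feature || {};
  // Date.UTC expects a 0-based month, so subtract 1.
  return new Date(Date.UTC(
    doc.Year,
    doc.Month - 1,
    doc.Day,
    f.Hour || 0,
    f.Minute || 0,
    f.second || 0,
    f.millisecond || 0
  ));
}

// Sample document shaped like the one in the question (values are made up).
var sample = {
  Year: 2012, Month: 11, Day: 7,
  Feature: { FeatureID: 42, Hour: 7, Minute: 55, second: 30, millisecond: 125 }
};
var when = toLogDate(sample);
```

A single Date key like this can be covered by one index and queried with `$gte` / `$lt`, instead of combining several separate numeric keys.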

But to begin with, just create an index and check the performance. Do not forget the explain operator to understand what MongoDB is doing under the hood.

P.S. After your comment about querying on different time fields, I would definitely suggest converting to a MongoDB date. You can look at my previous answer to see how to do something like this (you will need to modify it to fit your case, but the idea is the same).
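
Once the documents carry a real date, a "all documents from 2012" query becomes a half-open range on that one key. The sketch below only builds the filter object; `yearRangeFilter` is a hypothetical helper, the default field name `Date` is an assumption, and the boundaries are taken in UTC.

```javascript
// Hypothetical helper: build a half-open [start, end) range filter that
// matches every document whose date falls in the given calendar year (UTC).
// The default field name "Date" is an assumption; pass your own key name.
function yearRangeFilter(year, field) {
  var filter = {};
  filter[field || "Date"] = {
    $gte: new Date(Date.UTC(year, 0, 1)),     // Jan 1 of the year (months are 0-based)
    $lt: new Date(Date.UTC(year + 1, 0, 1))   // Jan 1 of the next year, exclusive
  };
  return filter;
}

var filter2012 = yearRangeFilter(2012);
// In the mongo shell this could be passed to a query, for example:
//   db.logs.find(filter2012).count()
```

Note that JavaScript months are 0-based, so `new Date(2010, 1, 1)` is 1 February 2010, not 1 January. The same pattern works for a month or a day by moving the two boundaries.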

Community
Salvador Dali
  • @Salvador Dali: I created an index on the Date column and converted the date/time keys to a Date() field. I used the C# driver to create the BsonDocument and insert it into the Logs collection. The Logs collection has the following fields: LogID, LogName, Version, Serial, Date, FeatureID, FeatureName. There are 300,000,000 records in the log table. Even after indexing, this query takes more than an hour: **var start= new Date(2010,1,1); var end=new Date(2011,1,1); db.LogsTable.find({DateTime:{$gte:start, $lt:end}}).count()**. Can anyone tell me what is wrong with the collection? – user2439903 Nov 11 '13 at 04:29
  • @user2439903 It is hard to tell what is wrong here. I would suggest creating a new question that includes the explain output for your query. That would give some information about what is wrong. – Salvador Dali Nov 11 '13 at 05:25