0

I have been reading about indexing in mongoDB to improve query performance. I have found many useful resources online.

From the mongoDB docs here

Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect

I understand the above completely.I found another resource which was amazingly helpful here

Here they try to find from_user "paasdude" without an index.

db.tweets.find({'from_user':'paasdude'}).explain();
{
"cursor" : "BasicCursor",
"nscanned" : 51748,
"nscannedObjects" : 51748,
"n" : 35,
"millis" : 40,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
   }
 }

Here they add an index to the "from-user" field, from my understanding the value "1" means sort it in ascending order.

db.tweets.ensureIndex({'from_user' : 1});

Here they try to find the from_user "paasdude" with an index.

db.tweets.find({'from_user':'paasdude'}).explain();
{
  "cursor" : "BtreeCursor from_user_1",
  "nscanned" : 35,
  "nscannedObjects" : 35,
  "n" : 35,
  "millis" : 3,
  "nYields" : 0,
  "nChunkSkips" : 0,
  "isMultiKey" : false,
  "indexOnly" : false,
  "indexBounds" : {
    "from_user" : [
        [
            "paasdude",
            "paasdude"
        ]
    ]
  }
}

It is clear that after adding the index query time went from 40 milliseconds to 3.

My Questions:

Although I understand the concept and need for indexing I don't understand how it actually worked.

  1. By giving the "from_user" field the an index of 1 did it sort all the from_user fields in ascending order?

  2. Does indexing only work in ascending (1) or descending (-1)?

  3. Why by simply adding an index of 1 to the from_user field drop the query time down from 40 to 3 milliseconds?

  4. When should indexes be used?

NOTE: I apologise if this question is off-topic for stack overflow. This is a more conceptual question and I wasn't sure where else to ask it. If you know a better place to ask this question. Please let me know and I will move it.

Community
  • 1
  • 1
Skywalker
  • 4,984
  • 16
  • 57
  • 122

2 Answers2

2

You already answered part of your own question. Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect. To give a concrete example, you probably used an index to access the mongoDB documentation rather than reading the documentation from start to finish and that must have saved you quite a bit of time.

Regarding the sort order, in your case the sort order doesn't make much of a difference. The sort order is becomes more relevant if a query returns a large number of records and the output is sorted by index fields. In that scenario the query execution will be faster if the index order matches the sorting order.

It is a bit of black art when you need to create an index, however, as a rule of thumb you should add an index on the field if you are filtering on it frequently and your query times are (getting) slow. As indexes require storage space and need be kept up to date when new records are inserted it is generally good practice to limit the number of indices.

Alex
  • 21,273
  • 10
  • 61
  • 73
  • Thank you so much for your answer. So basically indexing in mongoDB is more used with numerical data such as dates, ids, age, birth date etc. My database has more text data ( as in Strings data type ). And all my queries would be related to 'find projects that contain the healthcare field' etc. For this kind of queries should I NOT use indexing as its more text data and not numerical data? – Skywalker Oct 22 '15 at 10:27
  • MongoDB also has text indexes to support text search of string content in documents of a collection. This type of index might be more beneficial if you are querying for part of strings, see https://docs.mongodb.org/manual/core/index-text/ – Alex Oct 22 '15 at 10:40
  • this is amazing, very helpful Jaco thank you. Just a few questions, if one was looking to create a search bar whose function is specially search for projects in the projects collection in mongoDB. Can this text indexes approach be used? I am guessing once the user enter the search text and presses search button a HTTP call would be made to the mongoDB database which would use the text index set to find the related data and send it back? Is this a good approach in terms of performance of a systems search/query capability? And again thank you! – Skywalker Oct 22 '15 at 10:59
  • Yes @LorenzovonMatterhorn, that sounds like a sensible approach, however, please not that you can only have one full text searchable field per collection. Have a look at this an example demonstration Mongo's full text abilities using Node.js: https://www.compose.io/articles/full-text-search-with-mongodb-and-node-js/ – Alex Oct 22 '15 at 12:03
0
    1. Yes.
    1. Yes.
    1. Because mongo doesn't have to "search" the documents for the criteria. It looks up on the index and jumps to the address. Look at this SO-Question for a good overview.
    1. Hard to answer. Short: If you have a lot of read operations and not so much write operations.
Community
  • 1
  • 1
Clijsters
  • 4,031
  • 1
  • 27
  • 37
  • Thank you for your answer. Could you explain a little more on the 3rd point. I am trying to understand how it works behind the scenes. Once I understand that I'll be able to improve my database models and queries. Again thank you. – Skywalker Oct 22 '15 at 10:19
  • I edited my answer and added a link to another StackOverflow Question where this is explained. If you need more information, let me know. – Clijsters Oct 22 '15 at 10:21