According to http://docs.mongodb.org/manual/core/indexes/#multikey-indexes, it is possible to create an index on an array field using a multikey index. http://docs.mongodb.org/manual/applications/aggregation/#pipeline-operators-and-indexes lists some ways an index can be used in the aggregation framework. However, there are times when I need to perform an $unwind on an array field in order to perform a $group. My question is: can multikey indexes (or any index on such an array field) still be used once the field has been operated on in the middle of the pipeline?

2 Answers
Generally, only pipeline operators that can be flattened to a normal query ($match, $limit, $sort, and $skip) will be able to use the indexes on a collection. This is one of the reasons the $geoNear operator added in 2.4 has to be at the start of the pipeline.
Once you mutate the documents with $project, $group, or $unwind, the index is no longer valid/usable.
If you have an index on an array field, you can still use it before the $unwind to speed up the selection of documents that enter the pipeline, and then further refine the selected documents with a second $match.
Consider documents like:
{ tags: [ 'cat', 'bird', 'blue' ] }
with an index on tags.
If you only wanted to group the tags starting with b, then you could perform an aggregation like:
{ pipeline: [
{ $match : { tags : /^b/ } },
{ $unwind : '$tags' },
{ $match : { tags : /^b/ } },
/* the rest */
] }
The first $match does the coarse-grained match using the index on tags.
The second $match, after the $unwind, won't be able to use the index (the document above is now 3 documents), but it can evaluate each of those documents to filter out the extra documents that get created (removing { tags : 'cat' } in the example).
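To see why the second $match is still needed, here is a rough sketch in plain JavaScript that only mimics what the three stages do to the documents. It does not touch MongoDB or any index; the second document and the names docs, coarse, unwound, and refined are made up for illustration.

```javascript
// Plain JavaScript stand-ins for the three pipeline stages; no MongoDB involved.
// The second document is a hypothetical addition for illustration.
const docs = [
  { _id: 1, tags: ['cat', 'bird', 'blue'] },
  { _id: 2, tags: ['dog', 'red'] },
];

const startsWithB = /^b/;

// Stage 1 ($match): on an array field, a document is kept if ANY element matches.
// In MongoDB this is the stage that can use the index on tags.
const coarse = docs.filter(d => d.tags.some(t => startsWithB.test(t)));

// Stage 2 ($unwind): one output document per array element.
const unwound = coarse.flatMap(d => d.tags.map(t => ({ _id: d._id, tags: t })));

// Stage 3 (second $match): each unwound document is checked on its own,
// which is what drops { tags: 'cat' }.
const refined = unwound.filter(d => startsWithB.test(d.tags));

console.log(coarse.length);  // 1 document survives the coarse match
console.log(unwound.length); // 3 documents after the unwind
console.log(refined);        // only the 'bird' and 'blue' documents remain
```

The first filter narrows the input cheaply, and the last one cleans up the non-matching elements that the unwind brings along with each selected document.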
HTH - Rob.

- Thanks for the answer. However, I find "Once you mutate the documents with ... `$unwind` the index is no longer valid" to contradict the rest of the answer. Can you explain why that is so? – MervS Mar 25 '13 at 03:28
- Sorry, I should have been clearer. I will try to edit it in a second, but the first match uses the index and the second will not. – Rob Moore Mar 26 '13 at 00:23
Hmm @Rob does give the right answer but I see how he could lead you down the wrong path a little:
If you have an index on an array field you can still use it before and after the $unwind to speed up the selection of documents to pipeline and then further refine the selected documents.
Basically the example he gives:
{ pipeline: [
{ $match : { tags : /^b/ } },
{ $unwind : '$tags' },
{ $match : { tags : /^b/ } },
/* the rest */
] }
will not use a multikey index past the $unwind. It will be able to use the index to search for all root documents that have a tag starting with b; however, it will not be able to use an index in the second $match, which filters the unwound documents.
The $match will only use an index before the mutation.
So basically, once you have mutated the documents and loaded them into the pipeline, it is currently almost impossible to use an index.
