0

There will be about 3 billion docs in the collection across shards. Assuming I would be using queries that is not fulfilled entirely by the index(es). And I am not using the keys for sorting.

Does it make sense to have indexes as follows (which match the queries):

{"sid":1, "cid":1, "mid":1}
{"cid":1, "hid":1, "mid":1}
{"mid":1, "hid":1, "sid":1}
{"hid":1, "sid":1, "cid":1, "mid":1}
{"mid":1, "cid":1}
{"mid":1, "sid":1}

Or:

{sid:1}
{cid:1}
{hid:1}
{mid:1} 
MikeJavadi
  • 21
  • 1

1 Answers1

1

This is a valid question. You are not using these keys for sorting so I will assume your queries contain no sort, since you must be aware (to have said what you did) that index inter-sectioning cannot work across sorts.

I would say: better to have the compound indexes.

Even though it is more indexes to maintain it makes for straight answers to your queries without the problematic calculations required for inter-sectioning. I mean you are doing this over 3.5b records, that won't end well.

The compound index approach puts the weight of maintenance on the insertion/updating side which is much better than on the query side. Even if the indexes only partially fulfill the queries at any given point in time.

It seems that MongoDB's intersectioning has got more advanced:

In general, each index intersection involves two indexes; however, MongoDB can employ multiple/nested index intersections to resolve a query.

http://docs.mongodb.org/manual/core/index-intersection/

So even though it can use more than two indexes the usage of 3 intersections would still cause a nested intersection to occur which doesn't sound pleasant.

I did answer a similar question some time ago whereby it was said by 10gen that inter-sectioning is essentially a last resort, not something that is desirable.

Sammaye
  • 43,242
  • 7
  • 104
  • 146
  • Thank you for your answer. I don't know how the intersection would perform since (at least I am assuming) that the 3 indexes would get loaded into memory anyway. So in theory, the only thing that I would be saving is some disk. I am dealing with some shards that still need to be balanced and I supposed the indexes will further slow down the balancer (when I restart it). (i did not have good input on shard key data distribution so that I could pre-split the chunks better. When did you hear that from 10-gen? v2.4 or v2.6 ? – MikeJavadi May 06 '15 at 19:58
  • @MikeJavadi intersections are a lot heavier than just whether or not you have the index in memory. I mean to intersect you have to perform an array intersect on each of the indexes slowly winding them down to the answer – Sammaye May 06 '15 at 20:02
  • @MikeJavadi as to hearing it: https://jira.mongodb.org/browse/SERVER-3071 it is in the comments – Sammaye May 06 '15 at 20:03