4

I use Mango queries with Couchdb when users want to search the database. They can search by multiple fields in the document.

To simplify I have a document with three fields all of which can be searched to find the document.

FieldOne: cow
FieldTwo: pig
FieldThree: dog

When creating a mango index for the query what fields should I use? Should I create and index with multiple fields? Does the order matter?

There are also three different document types (A, B C) defined in a field. I only need to search one, is it possible to exclude the other types from the index to make it faster?

Field_Type: A

It would make sense to me to be able to run an index against a view to only search through the documents I am interested in. Is this possible?

Example indexes

One index on field known to appear in the query

  {
       "index": {
          "fields": [
             "FieldOne"
          ]
       },
       "name": "foo-json-index",
       "type": "json"
    }

Multiple indexes, not sure if used or not?

 {
       "index": {
          "fields": [
             "FieldOne",
             "FieldTwo",
             "FieldThree"
          ]
       },
       "name": "foo-json-index",
       "type": "json"
    }

Or multiple indexes to choose the correct one when building the query?

which is the correct approach to get the fastest search results?

R.Meredith
  • 184
  • 2
  • 10
  • Do you have an example of the query selector(s) you are seeking to optimise. It's hard to suggest an index that will cover arbitrary queries over multiple fields but if you have a specific query to optimise I may be able to advise. – Will Holley Nov 30 '17 at 16:46
  • The query is very variable over multiple fields, it is used by backoffice staff so I don't think it can be easily optimized as they seem to want to search by everything. thanks for your offer and answer it cleared up a lot of questions in my head. @WillHolley – R.Meredith Dec 01 '17 at 10:36

1 Answers1

8

I don't think you can easily solve this in the general case - you'd need a specific selector (or set of selectors) to optimise for. However, there are 3 aspects to consider when creating an index that may help:

  1. When you create a Mango index, only documents containing all of the indexed fields are included in the index. The second index example you cite, containing all 3 fields, is perfectly fine so long as you are always going to specify all 3 fields in the query selector.

  2. At query time, an index can only be used if all it's indexed fields are required to exist according to the selector. For example, if your index contains fields A and B but you only query for A, we can't use the index because it won't include documents that contain A but not B.

  3. Order of fields in the index matters. An index can only be used if it can find a contiguous range of values for the indexed fields - the same as a compound key for a _view. For example, let's say you have an index on fields A and B and the documents [{A:1,B:1},{A:1,B:2},{A:2,B:1},{A:2,B:2}]. The index will look like: [[1,1],[1,2],[2,1],[2,2]]. If your query is then A >= 1 AND B == 2, the matching documents ([1,2] and [2,2]) the only range that covers all matching documents is [1,2],[2,1],[2,2]] - the value [2,1] would need to be filtered out in memory.

You can see what index is used to fulfil a query using the _explain endpoint, and that should give you some clue as to how selective the index is able to be for a given selector.

Will Holley
  • 1,745
  • 10
  • 11
  • Sorry, I'm a bit confused. Is `A > 1 AND B == 2` accurate? Shouldn't it be `A >= 1 AND B == 2`? The matching documents are `[1,2]` and `[2,2]`, therefore `A` can be both `1` and `2`. – Megidd Apr 07 '18 at 08:10