I am attempting to implement partial, case-insensitive matching in Elasticsearch 7.

I am creating the index with the settings:

{
  "merchant_3" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "2",
        "provided_name" : "merchant_3",
        "max_result_window" : "100000",
        "creation_date" : "1592833582520",
        "analysis" : {
          "analyzer" : {
            "englishAnalyzer" : {
              "filter" : [
                "lowercase"
              ],
              "tokenizer" : "standard"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "5mjRMQ65TSGFFU0LfAH4eA",
        "version" : {
          "created" : "7060299"
        }
      }
    }
  }
}

and the mappings:

{
  "merchant_3" : {
    "mappings" : {
      "properties" : {
        "Name" : {
          "type" : "keyword"
        },
        ...
      }
    }
  }
}

The following query returns the document correctly:

POST /merchant/_search
{
  "query": {
    "wildcard": {
        "Name": "*Example*"
    }
  }
}

But when I lowercase the search term it does not return the document:

POST /merchant/_search
{
  "query": {
    "wildcard": {
        "Name": "*example*"
    }
  }
}

How do I configure Elasticsearch to make it match the Name field value using a lowercase search term?

crmepham
  • Curious to know why you are using a leading wildcard, which is costly, on a keyword field, which is not analyzed, and why you are not applying the custom analyzer you created. Anyway, I am adding an answer that addresses all of these issues. – Amit Jun 22 '20 at 14:12
  • Can you add a sample document which you are expecting to match? – Gibbs Jun 22 '20 at 14:23

1 Answer

As mentioned in my comment, the current approach has several flaws: the leading wildcard is costly, keyword fields are not analyzed, and the custom analyzer you defined is never applied to the field. Since you have not described your use case, I suggest reading my SO answer, which explains the functional and non-functional requirements you should consider.

For your case, I am showing an index-time approach using an ngram analyzer. You can switch it to edge ngram if you only need prefix-style partial search.
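To make the difference between the two filters concrete, here is a minimal Python sketch of the kinds of terms each one emits for a single lowercased token. It only models the idea and is not Elasticsearch's implementation; the function names `ngrams` and `edge_ngrams` are made up for illustration, and the defaults mirror the `min_gram` and `max_gram` values used in this answer.

```python
def ngrams(token, min_gram=1, max_gram=10):
    """Every substring of `token` whose length is in [min_gram, max_gram]."""
    return {
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


def edge_ngrams(token, min_gram=1, max_gram=10):
    """Only the prefixes of `token` whose length is in [min_gram, max_gram]."""
    longest = min(max_gram, len(token))
    return {token[:n] for n in range(min_gram, longest + 1)}


if __name__ == "__main__":
    # ngram covers substrings anywhere in the token ("ovi", "vie", ...)
    print(sorted(ngrams("movie"), key=lambda g: (len(g), g)))
    # edge ngram covers prefixes only ("m", "mo", "mov", ...)
    print(sorted(edge_ngrams("movie"), key=len))
```

In this model a term such as `ovi` exists only in the ngram set, so infix searches need ngram, while edge ngram keeps far fewer terms when prefix matching is enough.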

Index mapping

{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 10
        }
      },
      "analyzer": {
        "autocomplete": { 
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    },
    "index.max_ngram_diff": 9 // note this: must be at least max_gram - min_gram (10 - 1 = 9)
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete", // note this
        "search_analyzer": "standard" // note this
      }
    }
  }
}

Index sample docs

{
  "title" : "Example movie"
}

Search with Example

{
    "query": {
        "match" : {
            "title" : "Example"
        }
    }
}

Result

"hits": [
      {
        "_index": "testpartial",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.471659,
        "_source": {
          "title": "Example movie"
        }
      }
    ]

Searching with the lowercase term example returns the same result; just change the search term in the previous query.
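Why the lowercase term matches can be sketched in a few lines of Python. This is a rough simulation under stated assumptions, not Elasticsearch code: the regex split stands in for the standard tokenizer, the helper names (`standard_tokens`, `index_terms`, `search_terms`, `matches`) are hypothetical, and a match query is modelled as "any search term is present among the indexed terms" (the default OR behaviour).

```python
import re


def standard_tokens(text):
    # Rough stand-in for the standard tokenizer: split on non-word characters.
    return re.findall(r"\w+", text)


def index_terms(text, min_gram=1, max_gram=10):
    # Index-time chain from the mapping: tokenize -> lowercase -> ngram filter.
    terms = set()
    for token in standard_tokens(text):
        token = token.lower()
        for n in range(min_gram, max_gram + 1):
            for i in range(len(token) - n + 1):
                terms.add(token[i:i + n])
    return terms


def search_terms(query):
    # Search-time chain: the standard analyzer tokenizes and lowercases,
    # but does not split the query into n-grams.
    return [token.lower() for token in standard_tokens(query)]


def matches(document, query):
    # Models a match query with the default OR operator.
    indexed = index_terms(document)
    return any(term in indexed for term in search_terms(query))


if __name__ == "__main__":
    for query in ("Example", "example", "mov", "movie", "xyz"):
        print(query, matches("Example movie", query))
```

Because both sides are lowercased, the case of the search term stops mattering, and because every substring up to `max_gram` characters is indexed, partial terms such as `mov` are found as well. In this model a single search term longer than `max_gram` has no indexed counterpart, so `max_gram` should be chosen with the longest expected search term in mind.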

Martijn Pieters
Amit
  • What if he is searching for `ExampleDocument`? – Gibbs Jun 22 '20 at 14:22
  • @Gibbs my query will bring this back as well :). The OP didn't add a sample, but I verified your test case and it passed :) – Amit Jun 22 '20 at 14:24
  • This won't work if the search term is `movie` or `mov` though? – crmepham Jun 22 '20 at 16:24
  • @crmepham if you have a document which contains `movie`, then it works for both `movie` and `mov`; I just tested that as well. Can I request you to try my solution and let me know if something doesn't work? Also, did you get a chance to go through the links I provided in my answer? – Amit Jun 23 '20 at 05:31