
I am trying to implement a search in Elasticsearch that finds a specific term in one or more fields. The term can be a single letter, several letters, or several words, and it can appear anywhere in the field. This is very similar to a SQL query with a WHERE clause like this:

field1 like '%term%' or field2 like '%term%' or field3 like '%term%'

I might have many indices and many types, so to avoid specifying a mapping for every field I decided to use dynamic templates. First of all, I defined the analysis part:

"analysis" : {
  "filter" : {
    "word_delimiter_filter" : {
      "type" : "word_delimiter"
    },
    "ngram_filter" : {
      "type" : "nGram",
      "min_gram" : "1",
      "max_gram" : "50"
    }
  },
  "analyzer" : {
    "custom_search_analyzer" : {
      "filter" : [ "lowercase", "word_delimiter_filter" ],
      "type" : "custom",
      "tokenizer" : "standard"
    },
    "ngram_analyzer" : {
      "filter" : [ "lowercase", "word_delimiter_filter", "ngram_filter" ],
      "type" : "custom",
      "tokenizer" : "standard"
    }
  }
}

This lets me match a term of variable length against any part of the field: the beginning, middle, or end, as well as whole words.
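
To make the effect of ngram_filter concrete, here is a rough Python sketch of which terms end up in the index for a value such as "London Store". It is not Elasticsearch code: `ngrams` and `index_terms` are hypothetical helpers, whitespace splitting only approximates the standard tokenizer, and word_delimiter_filter is skipped entirely.

```python
def ngrams(token, min_gram=1, max_gram=50):
    """Return every substring of `token` with length in [min_gram, max_gram]."""
    grams = set()
    for start in range(len(token)):
        longest = min(max_gram, len(token) - start)
        for length in range(min_gram, longest + 1):
            grams.add(token[start:start + length])
    return grams


def index_terms(text):
    """Approximate index-time analysis: split on whitespace, lowercase, n-gram."""
    terms = set()
    for token in text.lower().split():
        terms |= ngrams(token)
    return terms


terms = index_terms("London Store")
print("ond" in terms)  # True: a fragment from the middle of "london"
print("n s" in terms)  # False: grams never span two tokens
```

Because grams are built per token, a multi-word search term only matches if the search analyzer splits it into words that are each looked up separately, which is what custom_search_analyzer is meant to do here.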

I decided to include all searchable fields in the _all field and to search for the term in that field only. So my dynamic template looks like this:

{
  "mappings": {
    "my_type": {
      "_all" : {
        "analyzer" : "ngram_analyzer",
        "search_analyzer" : "custom_search_analyzer"
      },
      "dynamic_templates" : [{
        "searchable_fields": {
          "match_mapping_type" : "string",
          "match": "field1|field2|field3",
          "match_pattern": "regex",
          "mapping": {
            "type": "string",
            "fields": {
              "raw": {
                "type": "string",
                "index": "not_analyzed"
              }
            },
            "analyzer" : "ngram_analyzer",
            "search_analyzer" : "standard",
            "include_in_all": true
          }
        }
      }]
    }
  }
} 

It works great when the JSON is flat, with no hierarchy. The problem arises when the JSON is much more complex with several nested objects and arrays of objects. Not all fields in these nested objects are searchable, so I need to specify which fields I want to search. I tried a regex match:

"match": "nested_field1\\.field2|nested_field2.field1"

but this didn't work. Also, path_match and path_unmatch accept only one path. So I wonder how to match several fields of nested objects so that they are added to the _all field for search.
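
One possible explanation, as far as I understand the dynamic template documentation, is that match is tested against the field name only, while path_match is tested against the full dotted path. The Python sketch below only illustrates that difference on the complex document from this question; it is not Elasticsearch code. `leaf_paths` is a hypothetical helper, the pattern is an illustrative choice, and whether match_pattern set to regex also applies to path_match in a given Elasticsearch version is an assumption that would need checking.

```python
import re


def leaf_paths(node, prefix=""):
    """Yield the dotted path of every scalar value, descending through arrays."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from leaf_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_paths(item, prefix)
    else:
        yield prefix


order = {
    "orderId": "1",
    "assistant": {"id": "12", "name": "John Doe"},
    "store": {"id": "15", "title": "London Store", "address": "15 Green Street London"},
    "items": [{"title": "Some Title", "attributes": {"size": "L", "colour": "black"}}],
}

# Illustrative selection of nested fields, written as one alternation.
pattern = re.compile(r"store\.title|assistant\.name")

paths = sorted(set(leaf_paths(order)))
by_full_path = [p for p in paths if pattern.fullmatch(p)]
by_leaf_name = [p for p in paths if pattern.fullmatch(p.rsplit(".", 1)[-1])]

print(by_full_path)  # ['assistant.name', 'store.title']
print(by_leaf_name)  # []
```

A pattern that contains a dot can never equal a bare field name such as title, which would be consistent with the regex above matching nothing.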

P.S. I also wonder whether matching against the _all field is an effective way to implement search with these requirements, or whether it is better to use a multi_match query.
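
For comparison, the two request bodies in question would look roughly like this. This is a minimal sketch that only builds the JSON; `all_field_query` and `multi_match_query` are hypothetical helper names, and the field list is an illustrative guess based on the complex document in this question.

```python
import json


def all_field_query(term):
    """Single match query against the catch-all _all field."""
    return {"query": {"match": {"_all": term}}}


def multi_match_query(term, fields):
    """Same term, but searched in an explicit list of fields."""
    return {"query": {"multi_match": {"query": term, "fields": list(fields)}}}


# Illustrative field selection; nested fields are addressed by dotted path.
searchable = ["store.title", "store.address", "assistant.name"]

print(json.dumps(all_field_query("lond"), indent=2))
print(json.dumps(multi_match_query("lond", searchable), indent=2))
```

With multi_match the list of searchable fields lives in the query rather than in the mapping, so the include_in_all part of the dynamic template would not be needed for that variant.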

A simple JSON document looks like this:

  {
    "storeId": "15",
    "title": "London Store",
    "address": "15 Green Street London",
    "description": "Some description",
    "phone": "Some phone",
    "country": "UK",
    "dateCreated": 1382444820000
  }

And a more complex JSON document:

{
  "orderId": "1",
  "assistant": {
    "id": "12",
    "name": "John Doe"
  },
  "store": {
    "id": "15",
    "title": "London Store",
    "address": "15 Green Street London"
  },
  "items": [{
    "title": "Some Title",
    "description": "Some description",
    "attributes": {
      "size": "L",
      "colour": "black"
    }
  }],
  "payments": [{
    "paymentNumber": "SomeNumber",
    "status": "Finished"
  }]
}
D. Joe
  • What do you mean by "json is much more complex with several nested objects"? If the fields you expect ngram_analyzer to apply to are part of nested objects, then you may have to define them in the mappings. Can I take a look at both your simple and complex documents? – user3775217 Mar 16 '17 at 12:12
  • @user3775217 I added this information to my question. – D. Joe Mar 16 '17 at 14:07
  • Similar question: https://stackoverflow.com/questions/44791075/in-elasticsearch-how-do-i-search-for-an-arbitrary-substring – Patrick Szalapski Jun 28 '17 at 02:31
