I am trying to implement a search in Elasticsearch that finds a specific term in one or more fields. The term can be a single letter, several letters, or several words, and it can occur anywhere in the field. It is very similar to an SQL query with a WHERE clause like this:
field1 like '%term%' or field2 like '%term%' or field3 like '%term%'
I might have many indices and many types, so to avoid specifying a mapping for every field I decided to use dynamic templates. First, I defined the analysis part:
"analysis" : {
    "filter" : {
        "word_delimiter_filter" : {
            "type" : "word_delimiter"
        },
        "ngram_filter" : {
            "type" : "nGram",
            "min_gram" : "1",
            "max_gram" : "50"
        }
    },
    "analyzer" : {
        "custom_search_analyzer" : {
            "filter" : [ "lowercase", "word_delimiter_filter" ],
            "type" : "custom",
            "tokenizer" : "standard"
        },
        "ngram_analyzer" : {
            "filter" : [ "lowercase", "word_delimiter_filter", "ngram_filter" ],
            "type" : "custom",
            "tokenizer" : "standard"
        }
    }
}
This lets me match a term of variable length against any part of a field: the beginning, the middle, or the end, as well as whole words.
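To make the intent concrete, here is a rough Python sketch of what I expect the n-gram filter to do to a single token. It is not Elasticsearch code: the function name ngrams is made up for illustration, and it only imitates the lowercase and ngram_filter steps (min_gram 1, max_gram 50), ignoring the tokenizer and the word delimiter filter.

```python
def ngrams(token, min_gram=1, max_gram=50):
    """Return every substring of `token` whose length lies in [min_gram, max_gram].

    Hypothetical helper: a simplified imitation of lowercase + ngram_filter.
    """
    token = token.lower()
    grams = set()
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break  # longer grams from this start would run past the token
            grams.add(token[start:end])
    return grams


# Index-time terms for one token of the "title" field.
index_terms = ngrams("London")

# A search term matches if it equals one of the indexed grams,
# which is the same as being a substring of the original token.
for term in ("l", "ond", "don", "london", "xyz"):
    print(term, term in index_terms)
```

Because every substring ends up as its own indexed term, a lowercased search term only has to equal one of them, which is what gives the '%term%' behaviour. The search-side analyzer therefore must not produce n-grams itself.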
I decided to add all searchable fields to the _all field and to search for the term in that field only. So my dynamic template looks like this:
{
    "mappings": {
        "my_type": {
            "_all" : {
                "analyzer" : "ngram_analyzer",
                "search_analyzer" : "custom_search_analyzer"
            },
            "dynamic_templates" : [{
                "searchable_fields": {
                    "match_mapping_type" : "string",
                    "match": "field1|field2|field3",
                    "match_pattern": "regex",
                    "mapping": {
                        "type": "string",
                        "fields": {
                            "raw": {
                                "type": "string",
                                "index": "not_analyzed"
                            }
                        },
                        "analyzer" : "ngram_analyzer",
                        "search_analyzer" : "standard",
                        "include_in_all": true
                    }
                }
            }]
        }
    }
}
It works great when I have flat JSON with no hierarchy. The problem arises when the JSON is more complex, with several nested objects and arrays of objects. In these nested objects not all fields are searchable, so I need to specify which fields I want to search in. I tried a regex match:
"match": "nested_field1\\.field2|nested_field2.field1"
but this didn't work. Also, path_match and path_unmatch each accept only one path. So I wonder how to match fields of several nested objects in order to add them to the _all field for search.
P.S. I also wonder whether matching against the _all field is an effective way to implement search with these requirements, or whether it is better to use a multi_match query.
A simple JSON document looks like this:
{
    "storeId": "15",
    "title": "London Store",
    "address": "15 Green Street London",
    "description": "Some description",
    "phone": "Some phone",
    "country": "UK",
    "dateCreated": 1382444820000
}
And a more complex JSON document:
{
    "orderId": "1",
    "assistant": {
        "id": "12",
        "name": "John Doe"
    },
    "store": {
        "id": "15",
        "title": "London Store",
        "address": "15 Green Street London"
    },
    "items": [{
        "title": "Some Title",
        "description": "Some description",
        "attributes": {
            "size": "L",
            "colour": "black"
        }
    }],
    "payments": [{
        "paymentNumber": "SomeNumber",
        "status": "Finished"
    }]
}
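For reference, the two alternatives from the P.S. would look roughly like this for the order document above. This is only a sketch in Python that builds the request bodies: the helper names and the list of fields are my own illustrative choices, and it assumes the sub-objects are mapped as plain objects so that dotted paths such as store.title can be used directly.

```python
import json


def all_field_query(term):
    """Body for a match query against the single _all field (hypothetical helper)."""
    return {"query": {"match": {"_all": term}}}


def multi_match_query(term, fields):
    """Body for a multi_match query over an explicit list of fields (hypothetical helper)."""
    return {"query": {"multi_match": {"query": term, "fields": fields}}}


# Illustrative selection of searchable fields from the order document.
searchable_fields = [
    "store.title",
    "store.address",
    "items.title",
    "items.description",
]

print(json.dumps(all_field_query("green"), indent=2))
print(json.dumps(multi_match_query("green", searchable_fields), indent=2))
```

With the first form the choice of searchable fields lives in the mapping (include_in_all), while with the second it lives in every query, so the field list has to be maintained on the client side.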