The basic problem is the same as in: Word-oriented completion suggester (ElasticSearch 5.x)
Neither the separate index nor the best answer there suits my case. I have multiple fields flowing into the suggest field, and I don't know in advance how many words they contain. I built shingles to fill the suggest field accordingly.
Mappings:
PUT test_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "person": {
          "type": "text"
        },
        "city": {
          "type": "text"
        },
        "tags": {
          "type": "keyword"
        },
        "suggest": {
          "type": "completion"
        }
      }
    }
  }
}
The suggest field contains the (self-)shingled values of "person", "city" and "tags":
POST test_index/my_type/_bulk
{"index":{}}
{ "person": "Michael Jackson", "city": "Far far away", "tags": "Rock", "suggest": ["michael", "michael jackson", "jackson", "far", "far away", "away", "rock", "concert"]}
{"index":{}}
{ "person": "Michelangelo Something", "city": "Any other place", "tags": "Artist", "suggest": ["michelangelo", "michelangelo something", "something", "any", "other place", "place", "artist"]}
{"index":{}}
{ "person": "Michael Middlename Jordan", "city": "Somewhere", "tags": ["Basketball", "Sport"], "suggest": ["michael", "michael middlename", "middlename", "middlename jordan", "jordan", "somewhwere", "basketball", "sport"]}
{"index":{}}
{ "person": "Robbie Williams Peterson", "city": "Far far away", "tags": ["Music", "Open Air"], "suggest": ["robbie", "robbie williams", "williams", "williams peterson", "peterson", "far", "far away", "away", "music", "open air"]}
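For illustration, here is a minimal Python sketch of how such suggest inputs could be generated. It assumes shingles of one and two words for "person" and "city", and whole lowercased values for "tags". The function names `shingles` and `build_suggest` are hypothetical, and the output only approximates the arrays above (for example, it also emits "far far" for "Far far away").

```python
def shingles(text, max_size=2):
    """Return lowercase word n-grams of 1..max_size words, in order, without repeats."""
    words = text.lower().split()
    seen = set()
    result = []
    for start in range(len(words)):
        for size in range(1, max_size + 1):
            if start + size > len(words):
                break
            gram = " ".join(words[start:start + size])
            if gram not in seen:
                seen.add(gram)
                result.append(gram)
    return result


def build_suggest(person, city, tags):
    """Combine all source fields into one input list for the completion field."""
    if isinstance(tags, str):
        tags = [tags]
    inputs = []
    # Text fields are shingled; tags are assumed to be kept as whole values.
    candidates = shingles(person) + shingles(city) + [tag.lower() for tag in tags]
    for candidate in candidates:
        if candidate not in inputs:
            inputs.append(candidate)
    return inputs


print(build_suggest("Robbie Williams Peterson", "Far far away", ["Music", "Open Air"]))
```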
And now searching for suggestions:
POST /test_index/_search?pretty
{
  "_source": "suggest",
  "suggest": {
    "suggest": {
      "text": "mic",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
This returns the following results:
  "text": "mic",
  "offset": 0,
  "length": 3,
  "options": [
    {
      "text": "michael",
      "_index": "test_index",
      "_type": "my_type",
      "_id": "AVoJHAVtjkwxBtXDegO0",
      "_score": 1,
      "_source": {
        "suggest": [
          "michael",
          "michael jackson",
          "jackson",
          "far",
          "far away",
          "away",
          "rock",
          "concert"
        ]
      }
    },
    {
      "text": "michael",
      "_index": "test_index",
      "_type": "my_type",
      "_id": "AVoJHAVtjkwxBtXDegO2",
      "_score": 1,
      "_source": {
        "suggest": [
          "michael",
          "michael middlename",
          "middlename",
          "middlename jordan",
          "jordan",
          "somewhwere",
          "basketball",
          "sport"
        ]
      }
    },
    {
      "text": "michelangelo",
      "_index": "test_index",
      "_type": "my_type",
      "_id": "AVoJHAVtjkwxBtXDegO1",
      "_score": 1,
      "_source": {
        "suggest": [
          "michelangelo",
          "michelangelo something",
          "something",
          "any",
          "other place",
          "place",
          "artist"
        ]
      }
    }
  ]
}
I need a way to deduplicate the results: one "michael" is enough. Furthermore, I am wondering why the score is always 1, no matter what the result is.
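To make the desired outcome concrete, here is a minimal Python sketch that deduplicates the options by their text on the client side. It assumes the full search response has already been parsed into a dict; the function name `unique_option_texts` is hypothetical. This is only a workaround, not a server-side solution: duplicates still count against the suggester's `size`, so a larger `size` may be needed to end up with enough distinct suggestions.

```python
def unique_option_texts(response, suggester="suggest"):
    """Collect distinct option texts from a suggest response, keeping first-seen order."""
    seen = set()
    texts = []
    for entry in response.get("suggest", {}).get(suggester, []):
        for option in entry.get("options", []):
            text = option["text"]
            if text not in seen:
                seen.add(text)
                texts.append(text)
    return texts


# Trimmed-down version of the response shown above.
sample_response = {
    "suggest": {
        "suggest": [
            {
                "text": "mic",
                "offset": 0,
                "length": 3,
                "options": [
                    {"text": "michael", "_id": "AVoJHAVtjkwxBtXDegO0"},
                    {"text": "michael", "_id": "AVoJHAVtjkwxBtXDegO2"},
                    {"text": "michelangelo", "_id": "AVoJHAVtjkwxBtXDegO1"},
                ],
            }
        ]
    }
}

print(unique_option_texts(sample_response))
```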