The basic problem is the same as in: Word-oriented completion suggester (ElasticSearch 5.x)

Neither the separate index nor the best answer there suits my case. Several fields flow into the suggest field, and I do not know in advance how many words each of them contains. I therefore build the shingles myself and fill the suggest field with them.
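To illustrate what "building a shingle" means here, this is a minimal Python sketch of how the suggest inputs could be derived from the source fields. The function names `shingles` and `build_suggest` and the maximum shingle size of 2 are assumptions made for illustration. It is not the exact code that produced the sample data further down, so its output can differ slightly from that data (for "Far far away" it also emits "far far").

```python
def shingles(text, max_size=2):
    """Return lowercase word n-grams of 1..max_size words, without repeats."""
    words = text.lower().split()
    grams = []
    for start in range(len(words)):
        for size in range(1, max_size + 1):
            if start + size <= len(words):
                gram = " ".join(words[start:start + size])
                if gram not in grams:
                    grams.append(gram)
    return grams


def build_suggest(doc, fields=("person", "city", "tags")):
    """Collect the shingles of all given fields into one list of suggest inputs."""
    inputs = []
    for field in fields:
        values = doc.get(field, [])
        if isinstance(values, str):
            # A single value and a list of values are handled the same way.
            values = [values]
        for value in values:
            for gram in shingles(value):
                if gram not in inputs:
                    inputs.append(gram)
    return inputs


doc = {"person": "Michael Jackson", "city": "Far far away", "tags": "Rock"}
doc["suggest"] = build_suggest(doc)
```

The resulting list would then be sent as the "suggest" value of the document at index time.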

Mappings:

PUT test_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "person": {
          "type": "text"
        },
        "city": {
          "type": "text"
        }, 
        "tags": {
          "type": "keyword"
        },
        "suggest": {
          "type": "completion"
        }
      }
    }
  }
}

The suggest field contains the self-built shingles of "person", "city" and "tags":

POST test_index/my_type/_bulk
{"index":{}}
{ "person": "Michael Jackson", "city": "Far far away", "tags": "Rock", "suggest": ["michael", "michael jackson", "jackson", "far", "far away", "away", "rock", "concert"]}
{"index":{}}
{ "person": "Michelangelo Something", "city": "Any other place", "tags": "Artist", "suggest": ["michelangelo", "michelangelo something", "something", "any", "other place", "place", "artist"]}
{"index":{}}
{ "person": "Michael Middlename Jordan", "city": "Somewhere", "tags": ["Basketball", "Sport"], "suggest": ["michael", "michael middlename", "middlename", "middlename jordan", "jordan", "somewhwere", "basketball", "sport"]}
{"index":{}}
{ "person": "Robbie Williams Peterson", "city": "Far far away", "tags": ["Music", "Open Air"], "suggest": ["robbie", "robbie williams", "williams", "williams peterson", "peterson", "far", "far away", "away", "music", "open air"]}

And now searching for suggestions:

POST /test_index/_search?pretty
{
  "_source": "suggest",
  "suggest": {
    "suggest": {
      "text": "mic",
      "completion": {
        "field": "suggest"
      }
    }
  }
}

This returns the following results:

        "text": "mic",
        "offset": 0,
        "length": 3,
        "options": [
          {
            "text": "michael",
            "_index": "test_index",
            "_type": "my_type",
            "_id": "AVoJHAVtjkwxBtXDegO0",
            "_score": 1,
            "_source": {
              "suggest": [
                "michael",
                "michael jackson",
                "jackson",
                "far",
                "far away",
                "away",
                "rock",
                "concert"
              ]
            }
          },
          {
            "text": "michael",
            "_index": "test_index",
            "_type": "my_type",
            "_id": "AVoJHAVtjkwxBtXDegO2",
            "_score": 1,
            "_source": {
              "suggest": [
                "michael",
                "michael middlename",
                "middlename",
                "middlename jordan",
                "jordan",
                "somewhwere",
                "basketball",
                "sport"
              ]
            }
          },
          {
            "text": "michelangelo",
            "_index": "test_index",
            "_type": "my_type",
            "_id": "AVoJHAVtjkwxBtXDegO1",
            "_score": 1,
            "_source": {
              "suggest": [
                "michelangelo",
                "michelangelo something",
                "something",
                "any",
                "other place",
                "place",
                "artist"
              ]
            }
          }
        ]
      }

I need a way to deduplicate the results: one "michael" is enough. Furthermore, I am wondering why the score is always 1, no matter the result.
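As a stopgap, the duplicates could be collapsed on the client side. The following Python sketch assumes the response has already been parsed into a dict and that the options sit under response["suggest"]["suggest"], matching the suggester name used in the request above. The function name `unique_suggestions` is hypothetical. This only hides the duplicates after the fact; it does not make Elasticsearch return distinct suggestions.

```python
def unique_suggestions(response, name="suggest"):
    """Return the distinct option texts of one suggester, in response order."""
    seen = set()
    texts = []
    for entry in response["suggest"][name]:
        for option in entry["options"]:
            text = option["text"]
            if text not in seen:
                seen.add(text)
                texts.append(text)
    return texts


# Shortened version of the response shown above.
response = {
    "suggest": {
        "suggest": [
            {
                "text": "mic",
                "offset": 0,
                "length": 3,
                "options": [
                    {"text": "michael", "_id": "AVoJHAVtjkwxBtXDegO0"},
                    {"text": "michael", "_id": "AVoJHAVtjkwxBtXDegO2"},
                    {"text": "michelangelo", "_id": "AVoJHAVtjkwxBtXDegO1"},
                ],
            }
        ]
    }
}

suggestions = unique_suggestions(response)
```

Because the duplicates still occupy slots in the returned options, the "size" of the completion request would probably have to be raised to end up with enough distinct texts.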

fexon
  • maybe one of the following helps: http://stackoverflow.com/questions/29886477/how-to-remove-duplicate-search-result-in-elasticsearch http://stackoverflow.com/questions/25448186/remove-duplicate-documents-from-a-search-in-elasticsearch http://stackoverflow.com/questions/26509045/filter-elasticsearch-results-to-contain-only-unique-documents-based-on-one-field – groo Feb 04 '17 at 17:46
  • Thank you. Unfortunately, aggregations won't work for a field with type "completion". – fexon Feb 06 '17 at 09:43
  • looks like it's by design in elastic 5 now: https://github.com/elastic/elasticsearch/issues/22912 – groo Feb 06 '17 at 11:12

0 Answers