How to get an exact match for more than one phrase

Question

The query below returns an exact match:

GET courses/_search
{
  "query": {
    "term" : {
         "name.keyword": "Anthropology 230"
      }
  }
}

I also need to find Anthropology 250, so the query should return both Anthropology 230 and Anthropology 250.

How can I get an exact match on both values?

But how is **Anthropology 230** an `exact` match with **Anthropology 230 also**? – A l w a y s S u n n y, Jul 16 '20 at 12:08
@aysh can you share your sample index data as well? And can you tell us what result you expect from that data? – ESCoder, Jul 16 '20 at 12:15
@AlwaysSunny it's 250, so basically it has to match two values, 230 and 250 – aysh, Jul 16 '20 at 12:47

A l w a y s S u n n y · Answer 1 · 2020-07-16T15:16:34.387

4

You can try `match`, `match_phrase`, or `match_phrase_prefix`.

Using `match`:

GET courses/_search
{
    "query": {
        "match" : {
            "name" : "Anthropology 230"
        }
    },
    "_source": "name"
}

Using `match_phrase`:

GET courses/_search
{
    "query": {
        "match_phrase" : {
            "name" : "Anthropology"
        }
    },
    "_source": "name"
}

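Or using `regexp` on the `name.keyword` subfield, since a regular expression is matched against whole indexed terms:

GET courses/_search
{
    "query": {
        "regexp" : {
            "name.keyword" : "Anthropology [0-9]{3}"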
        }
    },
    "_source": "name"
}

edited Jul 16 '20 at 15:16

answered Jul 16 '20 at 12:56

A l w a y s S u n n y


`match_phrase` and `match_phrase_prefix` are overkill here and can cause other issues if OP is not looking for a phrase match, and regex queries are expensive. Glad you added the `match` query option :) – Amit Jul 16 '20 at 13:59

@Always Sunny if you use a `match_phrase` query, it will not match both documents; **it will only match the one document that contains `Anthropology 230`** – ESCoder Jul 16 '20 at 14:06

@Bhavya, agreed, and I know that :). I had a typo; I meant **Anthropology** only. I also agree that `match` is better in this case, as `@Ninja` mentioned – A l w a y s S u n n y Jul 16 '20 at 15:16
To the downvoter: what is the reason for voting it down? – A l w a y s S u n n y Jul 17 '20 at 06:41

score 3 · Accepted Answer · edited Jul 25 '20 at 16:05

The mistake is that you are using a `term` query on a `keyword` field. Neither the query nor the field is analyzed, so Elasticsearch looks for the exact search string in the inverted index.

What you should do instead is search a `text` field, which you will already have if you did not define your own mapping. I am assuming that is the case, because your query uses `.keyword`, the subfield that is created automatically when no mapping is defined.

Now you can use the `match` query below. It is analyzed with the standard analyzer, which lowercases the text and splits it into tokens at word boundaries (here, the space), so your two sample docs are indexed as the tokens `anthropology` and `230`, and `anthropology` and `250`.

A simple and efficient query that returns both docs:

{
    "query": {
        "match" : {
            "name" : "Anthropology 230"
        }
    }
}

And the search result:

 "hits": [
      {
        "_index": "matchterm",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.8754687,
        "_source": {
          "name": "Anthropology 230"
        }
      },
      {
        "_index": "matchterm",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.18232156,
        "_source": {
          "name": "Anthropology 250"
        }
      }
    ]

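The query above matched both docs because the search string was analyzed into two tokens, `anthropology` and `230`, and a `match` query by default returns every document that contains any of its tokens. `anthropology` is present in both documents.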
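
To see this token-level matching at work, the same `match` query can be told to require every token instead of any token. This is a minimal sketch, assuming the same two sample docs and the default mapping. `operator` is a standard option of the `match` query, but using it here is only an illustration and not part of the original answer:

```json
{
    "query": {
        "match": {
            "name": {
                "query": "Anthropology 230",
                "operator": "and"
            }
        }
    }
}
```

With `and`, a document must contain both `anthropology` and `230`, so only the first sample doc should match. The default, `or`, is what lets the plain `match` query return both.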

You should definitely read about the analysis process. You can also use the analyze API to see the tokens generated for any text.

Analyze API output for your text:

POST http://{{hostname}}:{{port}}/{{index-name}}/_analyze

{
  "analyzer": "standard",
  "text": "Anthropology 250"
}


{
    "tokens": [
        {
            "token": "anthropology",
            "start_offset": 0,
            "end_offset": 12,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "250",
            "start_offset": 13,
            "end_offset": 16,
            "type": "<NUM>",
            "position": 1
        }
    ]
}

score 2 · Answer 3 · answered Jul 16 '20 at 14:06

Assuming you may have more 'Anthropology nnn' items, this should do what you need:

"query":{
    "bool":{
        "should":[
            {"term": {"name.keyword":"Anthropology 230"}},
            {"term": {"name.keyword":"Anthropology 250"}}
        ]
    }
}
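
When the goal is an exact match on any value from a list, a `terms` query expresses that in a single clause. This is a minimal sketch, assuming the `courses` index from the question and the auto-generated `name.keyword` subfield. The body is sent to `courses/_search` like the other queries in this thread:

```json
{
    "query": {
        "terms": {
            "name.keyword": ["Anthropology 230", "Anthropology 250"]
        }
    }
}
```

Like `term`, `terms` is not analyzed, so each value has to match the stored string exactly, including case and spacing.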
