
I am using the elastic-builder npm package.

Using esb.termQuery('Email', 'test')

Mapping:

"CompanyName": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }

Database fields:

"Email": "test@mycompany.com",
"CompanyName": "my company"

Query JSON: { term: { CompanyName: 'my' } } or { term: { Email: 'test' } }

Result:

"Email": "test@mycompany.com",
"CompanyName": "my company"

Expectation: no result. I need a full-string match, but the term query here is acting like 'like' or queryStringQuery.

I have three filters: prefix, exact match, and include.

Amit Rana
  • Did you get a chance to go through my answer? I am looking forward to your feedback. If my answer helped you resolve your issue, please don't forget to upvote and accept it. – ESCoder Oct 14 '20 at 10:09

1 Answer

The standard analyzer is the default analyzer, used if none is specified. It provides grammar-based tokenization.

In your example, you are probably not specifying any analyzer explicitly in the index mapping, so text fields are analyzed with the default, which is the standard analyzer. Refer to this SO answer for a detailed explanation.

With no analyzer defined, the following tokens are generated for the email value.

POST /_analyze

{
  "analyzer" : "standard",
  "text" : "test@mycompany.com"
}

Tokens are:

{
  "tokens": [
    {
      "token": "test",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "mycompany.com",
      "start_offset": 5,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
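
To make the consequence of these tokens concrete, here is a minimal sketch in plain JavaScript. It is a simplified model, not how Elasticsearch is implemented: it assumes a term query matches when the search term equals one of the indexed tokens exactly, with no analysis applied to the search term. The function name `termMatches` and both token arrays are hypothetical and exist only for illustration.

```javascript
// Simplified model: a term query matches when the search term equals
// one of the indexed tokens exactly (the search term is not analyzed).
function termMatches(indexedTokens, term) {
  return indexedTokens.includes(term);
}

// Tokens the standard analyzer produced for "test@mycompany.com" (see above).
const standardTokens = ['test', 'mycompany.com'];

// The same value kept as one token, as a keyword field would store it.
const keywordTokens = ['test@mycompany.com'];

console.log(termMatches(standardTokens, 'test'));              // true
console.log(termMatches(keywordTokens, 'test'));               // false
console.log(termMatches(keywordTokens, 'test@mycompany.com')); // true
```

This is why { term: { Email: 'test' } } returns the document: "test" is one of the tokens stored for the analyzed text field.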

If you want a full-text search, you can define a custom analyzer with a lowercase filter. The lowercase filter ensures that all letters are converted to lowercase both before the document is indexed and at search time.

The normalizer property of keyword fields is similar to analyzer except that it guarantees that the analysis chain produces a single token.
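
As a rough illustration of that single-token guarantee, the sketch below models a lowercase normalizer in plain JavaScript. It is only a simplified model under the assumption that the same normalizer is applied to the stored value and to the search input; `lowercaseNormalizer` and `normalizedTermMatches` are hypothetical names, not Elasticsearch APIs.

```javascript
// Simplified model of a normalizer with a lowercase filter:
// the whole value stays one token, only its case changes.
function lowercaseNormalizer(value) {
  return [value.toLowerCase()];
}

// The stored value and the search input go through the same normalizer,
// then the two single tokens are compared for exact equality.
function normalizedTermMatches(storedValue, searchValue) {
  const [storedToken] = lowercaseNormalizer(storedValue);
  const [searchToken] = lowercaseNormalizer(searchValue);
  return storedToken === searchToken;
}

console.log(lowercaseNormalizer('My Company'));                 // [ 'my company' ]
console.log(normalizedTermMatches('my company', 'My Company')); // true
console.log(normalizedTermMatches('my company', 'my'));         // false
```

Under this model the match is case-insensitive but still requires the full string, so a search for "my" alone does not match "my company".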

The uax_url_email tokenizer is like the standard tokenizer except that it recognises URLs and email addresses as single tokens.

Index Mapping:

{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "uax_url_email"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "CompanyName": {
        "type": "keyword",
        "normalizer": "my_normalizer"
      },
      "Email": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}

Index Data:

{
  "Email": "test@mycompany.com",
  "CompanyName": "my company"
}

Search Query:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "CompanyName": "My Company"
          }
        },
        {
          "match": {
            "Email": "test"
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "stof_64220291",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "Email": "test@mycompany.com",
          "CompanyName": "my company"
        }
      }
    ]
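
For the three filters mentioned in the question (prefix, exact match, include), one possible way to build the query clauses is sketched below as plain JavaScript objects. This is an assumption-laden sketch, not a definitive implementation: `buildFilter` and `escapeWildcard` are hypothetical helpers, the field is assumed to be a keyword field whose values are indexed in lowercase (for example through the lowercase normalizer above), and the input is lowercased in code rather than relying on the server to do it. The `term`, `prefix`, and `wildcard` clauses are standard Elasticsearch term-level queries.

```javascript
// Hypothetical helper: escape wildcard metacharacters so user input
// is treated literally inside a wildcard pattern.
function escapeWildcard(text) {
  return text.replace(/[\\*?]/g, '\\$&');
}

// Hypothetical helper: map a filter kind to an Elasticsearch query clause.
// Assumes `field` is a keyword field with lowercase-normalized values.
function buildFilter(kind, field, value) {
  const v = String(value).toLowerCase();
  switch (kind) {
    case 'exact':
      // Full-string match against the single stored token.
      return { term: { [field]: v } };
    case 'prefix':
      // Stored value must start with the input.
      return { prefix: { [field]: v } };
    case 'include':
      // Stored value must contain the input anywhere.
      return { wildcard: { [field]: `*${escapeWildcard(v)}*` } };
    default:
      throw new Error(`Unknown filter kind: ${kind}`);
  }
}

console.log(JSON.stringify(buildFilter('exact', 'CompanyName', 'My Company')));
// {"term":{"CompanyName":"my company"}}
console.log(JSON.stringify(buildFilter('prefix', 'CompanyName', 'My')));
// {"prefix":{"CompanyName":"my"}}
console.log(JSON.stringify(buildFilter('include', 'CompanyName', 'comp')));
// {"wildcard":{"CompanyName":"*comp*"}}
```

Note that a wildcard pattern with a leading `*` can be slow on large indices, so the include filter is the one most worth measuring before relying on it.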
ESCoder
  • Using keyword, I am facing issues with capital letters in CompanyName. – Amit Rana Oct 06 '20 at 06:42
  • "CompanyName": "My Company" – Amit Rana Oct 06 '20 at 06:42
  • @Amit Rana do you want to match on capital letters as well, i.e. `My Company`? – ESCoder Oct 06 '20 at 06:43
  • From the frontend I can get both capital and small letters. I convert the input to lowercase and then use that term to search in Elastic. If I use keyword, I get no result with lowercase. – Amit Rana Oct 06 '20 at 06:46
  • @Amit Rana okay, and what are the conditions on email? Do you want it not to be split? – ESCoder Oct 06 '20 at 06:46
  • I can check if a special character exists then I can use .keyword. Will that be a good solution? – Amit Rana Oct 06 '20 at 06:50
  • I want a simple full-string match on all fields. The frontend can send any field for a full-string match. – Amit Rana Oct 06 '20 at 06:51
  • @AmitRana for email you can use the `uax_url_email tokenizer`; it recognises URLs and email addresses as single tokens. Please go through my updated answer and let me know if this was your issue. – ESCoder Oct 06 '20 at 06:57
  • @AmitRana I have added the `uax_url_email tokenizer` to the index mapping as well. Please go through my answer and let me know if this was your issue :) – ESCoder Oct 06 '20 at 07:04
  • Thank you for your answer. Users can insert new custom fields, and I am inserting them into Elastic. When do I need to add the normalizer for those fields? – Amit Rana Oct 06 '20 at 07:08
  • @AmitRana sorry, I forgot to update my search query according to the normalized field (mentioned in the mapping before). I have now changed my index mapping, and you will be able to use the same search query to get your search results. Regarding your question above, you can add `"normalizer": "my_normalizer"` to the index mapping definition of the particular field where you want case-insensitive matching. – ESCoder Oct 06 '20 at 07:56
  • @AmitRana thank you for accepting my answer. Can you please upvote my answer as well? +1 for your question. – ESCoder Oct 29 '20 at 17:10