
My documents have a state field with the value "OK".

The following match query returns this document:

POST /bank/_search
{
    "query": {
        "bool": {
            "must": {
                "match": { "state": "OK" }
            }
        }
    }
}

The following term query does not return the document with state "OK":

POST /bank/_search
{
    "query": {
        "bool": {
            "must": {
                "term": { "state": "OK" }
            }
        }
    }
}

According to the definition, "The term query finds documents that contain the exact term specified in the inverted index." Given that, I am confused about why the term query does not return the document.

I imported the data through Sense by executing the following commands:

curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary "@accounts.json"
curl 'localhost:9200/_cat/indices?v'

It would be great if someone could also explain inverted vs. non-inverted indexes, analyzed vs. non-analyzed fields, and term vs. match queries. I have read about these, but I am still confused.

Sahil Sharma

1 Answer


A match query always analyzes the search text before matching, whereas a term query looks up the exact term as given, without any analysis. In practice:

When you index the text "OK", Elasticsearch analyzes it by default (with the Standard Analyzer) and stores it in the inverted index as "ok" (lowercased).
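As a rough illustration of that indexing step, here is a toy model in Python. It is not Elasticsearch code: `toy_standard_analyze` and `build_inverted_index` are hypothetical helpers, the sample documents are made up, and the regex tokenizer only approximates what the Standard Analyzer does (word splitting plus lowercasing).

```python
import re
from collections import defaultdict


def toy_standard_analyze(text):
    """Rough stand-in for the Standard Analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def build_inverted_index(docs, field):
    """Map each analyzed token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in toy_standard_analyze(doc[field]):
            index[token].add(doc_id)
    return dict(index)


# Hypothetical sample documents, loosely shaped like the bank accounts data.
docs = {1: {"state": "OK"}, 2: {"state": "TX"}}
state_index = build_inverted_index(docs, "state")
print(state_index)  # {'ok': {1}, 'tx': {2}}
```

Note that the original "OK" never appears as a key: only the lowercased token is searchable, even though the stored JSON source still says "OK".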

So when you search with a match query:

POST /bank/_search
{
    "query": {
        "bool": {
            "must": {
                "match": { "state": "OK" }
            }
        }
    }
}

The text "OK" is converted to "ok" (by the analyzer configured for the state field) before matching, so the document is found.

For a term query, you have to lowercase the text yourself, because no analysis is performed on the query text at search time:

POST /bank/_search
{
    "query": {
        "bool": {
            "must": {
                "term": { "state": "ok" }
            }
        }
    }
}
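The difference between the two queries can be sketched with a toy lookup in Python. This is only a simplified model under stated assumptions: `analyze`, `term_query`, and `match_query` are hypothetical names, the analyzer is a whitespace split plus lowercasing, and the index is a plain dictionary standing in for the inverted index of the state field.

```python
def analyze(text):
    """Toy analyzer: whitespace split plus lowercasing (an approximation)."""
    return [token.lower() for token in text.split()]


# Toy inverted index for the "state" field, as it looks after indexing "OK".
inverted_index = {"ok": {1}}


def term_query(index, value):
    """Look up the value exactly as given; the query text is not analyzed."""
    return index.get(value, set())


def match_query(index, text):
    """Analyze the query text first, then look up each resulting token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


print(term_query(inverted_index, "OK"))   # set()
print(term_query(inverted_index, "ok"))   # {1}
print(match_query(inverted_index, "oK"))  # {1}
```

In this model the index holds a single entry, "ok"; the match query finds it for any casing because the query text goes through the same analyzer as the indexed text.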

If you always search for the exact value "OK", you can mark the state field as not analyzed in the mapping. The value is then stored in the index exactly as given ("OK") instead of being analyzed, so both term and match queries will match only that exact value.
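To make the effect of that setting concrete, here is a small Python sketch comparing what gets stored for an analyzed and a not-analyzed field. It is a toy model, not the Elasticsearch mapping API: `index_field` and `term_hits` are hypothetical helpers, and the analyzed branch only approximates the Standard Analyzer.

```python
def index_field(value, analyzed):
    """Return the tokens a toy index would store for one field value."""
    if analyzed:
        return [token.lower() for token in value.split()]
    return [value]  # not analyzed: the whole value is kept as a single token


def term_hits(stored_tokens, value):
    """A term lookup succeeds only when the value equals a stored token."""
    return value in stored_tokens


analyzed_tokens = index_field("OK", analyzed=True)
raw_tokens = index_field("OK", analyzed=False)

print(analyzed_tokens, term_hits(analyzed_tokens, "OK"))  # ['ok'] False
print(raw_tokens, term_hits(raw_tokens, "OK"))            # ['OK'] True
print(term_hits(raw_tokens, "ok"))                        # False
```

The trade-off is that a not-analyzed field becomes case-sensitive: "ok" no longer finds the document.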

See also: "How to not-analyze in ElasticSearch?"

Linoy
  • That's strange. The original JSON has "state" as "OK", and still only the lowercased "ok" returns the document. What is the reason for this? If the field were set to not analyzed, then I assume "OK" would return the document and "ok" wouldn't, correct? – Sahil Sharma Sep 18 '16 at 17:34
  • One more thing: how does the match query return the document when I input "Ok" or "oK"? Does it convert every input to lowercase, i.e. "ok", and then check? Or does the inverted index contain "OK", "ok", "Ok", "oK", i.e. all combinations? – Sahil Sharma Sep 18 '16 at 17:39
  • Yes, you are right. The match query uses the analyzer configured on the field (here, state), or the Standard Analyzer if none is configured, and runs the search text through it. The Standard Analyzer lowercases text, so whatever you enter (Ok, ok, oK, etc.) is lowercased before the search. – Linoy Sep 18 '16 at 17:43
  • Look at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html. The Standard Analyzer includes the lowercase token filter, which is what does this. – Linoy Sep 18 '16 at 17:46