
I have a multi_match query over several fields with several terms:

{
    "multi_match": {
        "query": ["екатеринбург", "тимирязева"],
        "fields": [
            "admin0_name^1.0", 
            "admin0_alternate_names^0.95", 
            "local_admin_name^0.6", 
            "locality_name^1.2", 
            "locality_alternate_names^1.15", 
            "neighborhood_name^0.3", 
            "street_name^1.4", 
            "housenumber^1.4", 
            "housenumber_exact^1.5", 
            "name.text^2.0"],
        "type": "most_fields",
        "_name": "main_search_query"
    }
}

Term екатеринбург should match locality_name and тимирязева should match street_name and name.text.

But the query explanation shows that only тимирязева was matched:

    1.0168997 = (MATCH) product of:
      5.0844984 = (MATCH) sum of:
        3.4778461 = (MATCH) weight(name.text:тимирязева^2.0 in 233899) [PerFieldSimilarity], result of:
          3.4778461 = score(doc=233899,freq=2.0), product of:
            0.39484683 = queryWeight, product of:
              2.0 = boost
              9.965216 = idf(docFreq=184, maxDocs=1447823)
              0.019811254 = queryNorm
            8.808089 = fieldWeight in 233899, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              9.965216 = idf(docFreq=184, maxDocs=1447823)
              0.625 = fieldNorm(doc=233899)
        1.6066521 = (MATCH) weight(street_name:тимирязева^1.4 in 233899) [PerFieldSimilarity], result of:
          1.6066521 = score(doc=233899,freq=2.0), product of:
            0.22453468 = queryWeight, product of:
              1.4 = boost
              8.095495 = idf(docFreq=1199, maxDocs=1447823)
              0.019811254 = queryNorm
            7.1554747 = fieldWeight in 233899, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.095495 = idf(docFreq=1199, maxDocs=1447823)
              0.625 = fieldNorm(doc=233899)
      0.2 = coord(2/10)
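
The arithmetic in this explanation can be checked by hand. The sketch below is only a reading of the numbers printed above, assuming the classic TF-IDF layout the explain output shows (queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)). The function name `term_score` is made up for illustration and is not part of any Elasticsearch API.

```python
import math

def term_score(boost, idf, query_norm, freq, field_norm):
    """Score of one term in one field, as laid out in the explain output."""
    query_weight = boost * idf * query_norm            # "queryWeight" line
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight" line
    return query_weight * field_weight

# Values copied from the explanation for doc 233899.
name_text = term_score(boost=2.0, idf=9.965216, query_norm=0.019811254,
                       freq=2.0, field_norm=0.625)
street_name = term_score(boost=1.4, idf=8.095495, query_norm=0.019811254,
                         freq=2.0, field_norm=0.625)

# Only 2 of the 10 field clauses matched, so the sum is scaled by coord(2/10).
coord = 2 / 10
total = (name_text + street_name) * coord

print(name_text, street_name, total)
```

Up to float rounding, the three printed values should agree with the 3.4778461, 1.6066521 and 1.0168997 lines above. Both matched clauses are for тимирязева; no clause for екатеринбург appears in the sum at all.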

To be sure that екатеринбург matches locality_name by itself, I combined this query with a term query, both as must clauses of a bool query. It matches.

If I change the order of the terms in the query array to ["тимирязева", "екатеринбург"], the situation reverses exactly: екатеринбург matches locality_name, but тимирязева doesn't match street_name.

It looks like only the last term is taken into account by multi_match.

N.B. I use a pretty old version of ES, 1.4. Is this a bug, or do I misunderstand how multi_match works?

I can fall back on a workaround, passing the query as a single string rather than as an array, but I'm interested in why the pretokenized approach fails.
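
For reference, the string workaround could be sketched like this. It is a minimal sketch, assuming the terms contain no whitespace and that the fields' analyzers split the joined string back into the same terms. The helper name `build_multi_match` is hypothetical. The field list and boosts are the ones from the query above.

```python
import json

# Fields and boosts copied from the query in the question.
FIELDS = [
    "admin0_name^1.0",
    "admin0_alternate_names^0.95",
    "local_admin_name^0.6",
    "locality_name^1.2",
    "locality_alternate_names^1.15",
    "neighborhood_name^0.3",
    "street_name^1.4",
    "housenumber^1.4",
    "housenumber_exact^1.5",
    "name.text^2.0",
]

def build_multi_match(terms, fields=FIELDS):
    """Build a multi_match body whose "query" is one string, not an array.

    Assumption: joining with a single space is enough for the analyzer
    to recover the original terms.
    """
    return {
        "multi_match": {
            "query": " ".join(terms),
            "fields": list(fields),
            "type": "most_fields",
            "_name": "main_search_query",
        }
    }

body = build_multi_match(["екатеринбург", "тимирязева"])
print(json.dumps(body, ensure_ascii=False, indent=4))
```

The printed JSON is the same query as above, except that "query" is the single string "екатеринбург тимирязева".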

dkiselev
  • "Term екатеринбург should match locality_name and тимирязева should match street_name and name.text" that sounds more like a bool query where you search for the right term in the right field and combine it. Though the syntax has changed in current versions. I'd strongly encourage you to upgrade to a current version (the documentation of those unsupported versions is not even published any more). – xeraa Apr 14 '17 at 13:28
  • @xeraa I don't know in advance which fields each term from the array should match. So the query will be `locality_name:"Екатеринбург Ленина"` and `street_name:"Екатеринбург Ленина"`, but for `locality_name` only `Екатеринбург` will match. And the docs say that multi_match with cross_fields is made exactly for that case. The issue with a boolean query is how the score will be computed: I run into exactly the issue mentioned in the n.b. here https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#type-cross-fields – dkiselev Apr 18 '17 at 00:11

1 Answer


Try this; it might help you:

query_string is more powerful than multi_match (link).

{
  "query": {
    "query_string": {
       "fields" : ["admin0_name*", 
            "admin0_alternate_names*", 
            "local_admin_name*", 
            "locality_name*", 
            "locality_alternate_names*", 
            "neighborhood_name*", 
            "street_name*", 
            "housenumber*", 
            "housenumber_exact*", 
            "name.text*"] ,
      "query": "*екатеринбург* OR *тимирязева*"
    }
  }
}

Here is my analysis of how query_string is more powerful for partial search (link).
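
If the terms arrive as a list, the query string can be assembled in code so that the body stays valid JSON. This is only a sketch: the helper name `build_query_string` is made up, and it assumes the terms contain no characters that are reserved in query_string syntax, since nothing is escaped here. Keep in mind that leading wildcards such as `*term*` can be slow on a large index.

```python
import json

# Field patterns copied from the query_string example above.
FIELD_PATTERNS = [
    "admin0_name*",
    "admin0_alternate_names*",
    "local_admin_name*",
    "locality_name*",
    "locality_alternate_names*",
    "neighborhood_name*",
    "street_name*",
    "housenumber*",
    "housenumber_exact*",
    "name.text*",
]

def build_query_string(terms, fields=FIELD_PATTERNS):
    """Wrap each term in wildcards and join the terms with OR.

    Assumption: terms hold no query_string reserved characters,
    because this sketch does not escape anything.
    """
    query = " OR ".join("*{}*".format(term) for term in terms)
    return {"query": {"query_string": {"fields": list(fields), "query": query}}}

body = build_query_string(["екатеринбург", "тимирязева"])
print(json.dumps(body, ensure_ascii=False, indent=2))
```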

Vijay
  • I'll give that a try, it looks like a very powerful construction, but my question isn't really about a workaround; it's more about what's wrong with how I built the `multi_match` – dkiselev Apr 13 '17 at 13:28
  • Check this link, it explains the difference between multi_match and query_string http://stackoverflow.com/questions/15423033/multi-field-multi-word-match-without-query-string/15430238#15430238 – Vijay Apr 13 '17 at 14:29
  • That's a powerful feature, but I can't find a word about how relevance is computed for `query_string`. I think it will be the same as a boolean query with some subqueries. But I want to use blended term frequencies as described here https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#type-cross-fields If I could blend term frequencies, `boolean_query` or `query_string` would fit me. – dkiselev Apr 18 '17 at 00:28