I need a query that uses match_phrase together with fuzzy matching. However, I can't find any documentation on how to construct such a query, and when I try combining the two queries (nesting one inside the other), Elasticsearch throws errors. Is it possible to construct such a query?

Ben Abey
  • [multi_match](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html) might work, since it accepts both a phrase type and fuzziness, though there's a chance the phrase query also accepts fuzziness since it basically extends the match query – apokryfos Nov 29 '18 at 14:23
  • Hey @apokryfos, multi_match doesn't support fuzziness with the phrase type, as mentioned here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#phrase-fuzziness. I think in ES 6.x versions the only way to implement `fuzzy` search with `match_phrase` semantics is to use Span Queries. For a fuzzy search on a single term, we can use the fuzzy query described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html – Kamal Kunjapur Nov 29 '18 at 18:34

1 Answer

You would need to make use of Span Queries.

The query below performs a phrase match combined with fuzzy matching for champions league on a sample field, name, which is of type text.

If you want to search multiple fields, add another must clause for each additional field.

Notice that I've set slop: 0 and in_order: true, which enforces an exact phrase match (the terms must be adjacent and in the given order), while the fuzzy behaviour comes from the fuzzy queries wrapped in the match clause of each span_multi.

Sample Documents

POST span-index/mydocs/1
{
  "name": "chmpions leage"
}

POST span-index/mydocs/2
{
  "name": "champions league"
}

POST span-index/mydocs/3
{
  "name": "chompions leugue"
}

Span Query:

POST span-index/_search
{  
   "query":{  
      "bool":{  
         "must":[  
            {  
               "span_near":{  
                  "clauses":[  
                     {  
                        "span_multi":{  
                           "match":{  
                              "fuzzy":{  
                                 "name":"champions"
                              }
                           }
                        }
                     },
                     {
                        "span_multi":{
                           "match":{
                              "fuzzy":{
                                 "name":"league"
                              }
                           }
                        }
                     }
                  ],
                  "slop":0,
                  "in_order":true
               }
            }
         ]
      }
   }
}

Response:

{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "span-index",
        "_type": "mydocs",
        "_id": "2",
        "_score": 0.5753642,
        "_source": {
          "name": "champions league"
        }
      },
      {
        "_index": "span-index",
        "_type": "mydocs",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "name": "chmpions leage"
        }
      },
      {
        "_index": "span-index",
        "_type": "mydocs",
        "_id": "3",
        "_score": 0.5753642,
        "_source": {
          "name": "chompions leugue"
        }
      }
    ]
  }
}
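
Since the query has one span_multi clause per word, it can be generated from the phrase rather than written by hand. The following is only a minimal sketch in Python that builds the request body as a plain dict. The function name `fuzzy_phrase_query` is hypothetical, and the whitespace split plus lowercasing is an assumption that roughly mimics the standard analyzer (fuzzy is a term-level query, so the terms are not analyzed for you).

```python
import json


def fuzzy_phrase_query(phrase, fields, slop=0, in_order=True):
    """Build a bool/must body with one span_near clause per field.

    Hypothetical helper: the tokenization below (lowercase + whitespace
    split) is an assumption and should match how the field is analyzed.
    """
    terms = phrase.lower().split()
    if not terms:
        raise ValueError("phrase must contain at least one term")

    must = []
    for field in fields:
        # One fuzzy clause per word, wrapped in span_multi so it can be
        # used inside span_near.
        clauses = [
            {"span_multi": {"match": {"fuzzy": {field: term}}}}
            for term in terms
        ]
        must.append(
            {"span_near": {"clauses": clauses, "slop": slop, "in_order": in_order}}
        )
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    # Prints a body equivalent to the hand-written query for the "name" field.
    print(json.dumps(fuzzy_phrase_query("champions league", ["name"]), indent=2))
```

Passing more than one field, for example `["name", "title"]` (the second field is made up for illustration), produces one must clause per field, so the phrase has to match in every listed field.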

Let me know if this helps!

Kamal Kunjapur
  • So we need to split a query like "champions league" into ["champions", "league"] and then build the DSL query from those terms? – StoneLam Apr 03 '19 at 01:25
  • @StoneLam yes, that's correct. You can see that the query is constructed word by word. Span Queries, although much more verbose, are in a way more flexible. – Kamal Kunjapur Apr 03 '19 at 20:58
  • Thanks @Kamal, I tried this solution, but the fuzzy query makes "best car" match "best cat". Still a long way to go. – StoneLam Apr 08 '19 at 11:13
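
One possible way to reduce false positives such as "best car" matching "best cat" is to require short words to match exactly and to allow fuzziness only on longer ones. The sketch below is an assumption-laden illustration, not a tested recipe: the helper names and the `min_fuzzy_len` threshold are hypothetical, while `span_term`, `fuzziness` and `prefix_length` are standard query options.

```python
def term_clause(field, term, min_fuzzy_len=4, prefix_length=1):
    """Return an exact span_term for short terms, a limited fuzzy clause otherwise.

    min_fuzzy_len is a hypothetical threshold: terms shorter than this
    must match exactly.
    """
    if len(term) < min_fuzzy_len:
        return {"span_term": {field: term}}
    return {
        "span_multi": {
            "match": {
                "fuzzy": {
                    field: {
                        "value": term,
                        "fuzziness": 1,                  # at most one edit
                        "prefix_length": prefix_length,  # leading chars must match
                    }
                }
            }
        }
    }


def strict_fuzzy_phrase_query(phrase, field):
    """Build a span_near body that mixes exact and fuzzy clauses per term."""
    terms = phrase.lower().split()  # assumption: mirrors the field's analyzer
    if not terms:
        raise ValueError("phrase must contain at least one term")
    clauses = [term_clause(field, term) for term in terms]
    return {"query": {"span_near": {"clauses": clauses, "slop": 0, "in_order": True}}}
```

With these defaults, `strict_fuzzy_phrase_query("best car", "name")` emits an exact `span_term` for "car" and a one-edit fuzzy clause for "best". Whether that trade-off is acceptable depends on the data, so the threshold and fuzziness values would need tuning.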