
I would like to use the Fuzzy Search Feature of Solr. In my dataset, I have one record that looks like this:

{
  "lastName": "John Doe"
}

I would like to perform multiple fuzzy searches with the following strings:

  1. John D
  2. John Do
  3. John Doe
  4. John Deo
  5. John Xeo

I perform the queries like this:

  1. lastName:"John D"~
  2. lastName:"John Do"~
  3. lastName:"John Doe"~
  4. lastName:"John Deo"~
  5. lastName:"John Xeo"~

I expect queries 1, 2, 3 and 4 to return the record. Unfortunately, only query 3 returns it. As I understand the documentation, I can specify the maximum number of edits allowed; if I don't specify one, an edit distance of 2 is used. I think I'm using the syntax incorrectly, because my query looks a lot like a Proximity Search.

But how can I fuzzy search for a string that contains spaces without using a proximity search?

Mirco Widmer
    What is the field type? If you want to perform fuzzy matches like that, you probably want to keep the value as a single token rather than indexing it as separate tokens. But there is a way around that: the Complex Phrase Query Parser lets you specify `inOrder` and apply a fuzzy match to each token separately. https://lucene.apache.org/solr/guide/8_6/other-parsers.html#complex-phrase-query-parser – MatsLindh Oct 13 '20 at 07:46

1 Answer


My problem was that I had indeed executed a Proximity Search: in the standard query parser, a `~` after a quoted phrase means proximity, not fuzziness.

  1. lastName:John\ D~
  2. lastName:John\ Do~
  3. lastName:John\ Doe~
  4. lastName:John\ Deo~
  5. lastName:John\ Xeo~

These queries work exactly as I intended. I have to make sure that all the special characters listed at https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html, as well as the space, are escaped properly.
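
The escaping can be automated. Below is a minimal Python sketch, not an official Solr client API: the helper names `escape_term` and `fuzzy_query` are made up for illustration, and it assumes the field is indexed as a single token (for example a `string` field), as the comment on the question suggests. It backslash-escapes the standard query parser's special characters plus whitespace, then appends `~` with an optional edit distance.

```python
import re
from typing import Optional

# Characters the standard query parser treats as syntax, plus whitespace.
# Each matched character is prefixed with a backslash.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_term(text: str) -> str:
    """Backslash-escape query syntax characters and whitespace (hypothetical helper)."""
    return _SPECIAL.sub(r'\\\1', text)


def fuzzy_query(field: str, value: str, max_edits: Optional[int] = None) -> str:
    """Build `field:escaped\\ value~[N]` as a single fuzzy term (hypothetical helper)."""
    if max_edits is not None and max_edits not in (0, 1, 2):
        # Lucene fuzzy queries support an edit distance of at most 2.
        raise ValueError("max_edits must be 0, 1 or 2")
    suffix = "~" if max_edits is None else f"~{max_edits}"
    return f"{field}:{escape_term(value)}{suffix}"


if __name__ == "__main__":
    for name in ["John D", "John Do", "John Doe", "John Deo", "John Xeo"]:
        print(fuzzy_query("lastName", name))
```

Passing an explicit distance, such as `fuzzy_query("lastName", "John Deo", 1)`, produces `lastName:John\ Deo~1`, which tightens the match compared with the default distance of 2.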

Mirco Widmer