6

I have a problem with search relevance. The results of the following request are very strange:

Candidate.search('martin', fields: [:first_name, :last_name], 
                           match: :word_start, misspellings: false).map(&:name)


["Kautzer Martina",
 "Funk Martin",
 "Jaskolski Martin",
 "Gutmann Martine",
 "Wiegand Martina",
 "Schueller Martin",
 "Dooley Martin",
 "Stiedemann Martine",
 "Bartell Martina",
 "Gerlach Martine",
 "Green Martina",
 "Lang Martine",
 "Legros Martine",
 "Ernser Martina",
 "Boehm Martina",
 "Green Martine",
 "Nolan Martin",
 "Schmidt Martin",
 "Hoppe Martin",
 "Macejkovic Martine",
 "Emard Martine"]

Why does Martina come before Martin?

Searchkick config:

searchkick language: %w(German English), word_start: [:first_name, :last_name]
rkotov93

3 Answers

1

When you use word_start, Searchkick tokenizes the chosen fields (:first_name and :last_name) with the searchkick_word_start_index analyzer. That is a custom analyzer which uses the following edgeNGram token filter:

          searchkick_edge_ngram: {
            type: "edgeNGram",
            min_gram: 1,
            max_gram: 50
          },

So, when Kautzer Martina gets indexed, the following tokens are actually produced and indexed:

  • :first_name: m, ma, mar, mart, marti, martin, martina
  • :last_name: k, ka, kau, kaut, kautz, kautze, kautzer

Similarly, for Funk Martin:

  • :first_name: m, ma, mar, mart, marti, martin
  • :last_name: f, fu, fun, funk

As you can see, when you search for martin, both documents match because each contains the token martin. The results are sorted by descending score (the default). If you want to order the results differently, you can use sorting and call your search with

order: [{last_name: :asc},{first_name: :asc}]
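
To make the tokenization described above concrete, here is a minimal Ruby sketch that imitates an edge n-gram filter with min_gram 1 and max_gram 50. The edge_ngrams helper is hypothetical and written only for illustration. It is not part of Searchkick or Elasticsearch, and the real analyzer may apply further filters beyond the lowercasing assumed here.

```ruby
# Hypothetical helper (not a Searchkick API): returns the prefixes an
# edge n-gram filter would emit for a single word.
# Assumption: the word is lowercased before the prefixes are generated.
def edge_ngrams(word, min_gram: 1, max_gram: 50)
  token = word.downcase
  longest = [token.length, max_gram].min
  (min_gram..longest).map { |length| token[0, length] }
end

martin_tokens  = edge_ngrams("Martin")
martina_tokens = edge_ngrams("Martina")

# Both token lists contain "martin", so a word_start search for
# "martin" matches both first names.
puts martin_tokens.include?("martin")
puts martina_tokens.include?("martin")
```

Since the query term appears among the tokens of both names, the match alone does not separate Martin from Martina. Any difference in their order has to come from scoring or from an explicit order.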
Val
  • OK, but what should I use if I want results ordered by relevance? In this case I need all records whose first_name is Martin to come first. If I sort, I get a completely different result: `Candidate.search('martin', fields: [:first_name, :last_name], match: :word_start, misspellings: false, order: [{last_name: :asc},{first_name: :asc}]).map(&:name)` returns `["Bartell Martina", "Boehm Martina", "Dooley Martin", "Emard Martine", "Ernser Martina", "Funk Martin", "Gerlach Martine", "Green Martina", "Green Martine", "Gutmann Martine", "Hoppe Martin", "Jaskolski Martin", ...]` – rkotov93 Apr 26 '16 at 15:29
  • Then you should order by first_name first. Try it out. – Val Apr 26 '16 at 17:12
  • Did you try to change the sort order to `order: [{first_name: :asc},{last_name: :asc}]` ? – Val Apr 28 '16 at 03:34
  • This way the records are sorted by first_name first, but not by relevance. So if I want to search by last_name, for example, it leads to the same problem :( – rkotov93 Apr 28 '16 at 17:41
1

Searchkick 1.4 fixes this issue. There's even a test case dedicated to this question :)

Andrew Kane
0

Try this: misspellings: {edit_distance: 0}

The problem with match: is that you have to match the exact word, including capitalization. I hope this works.

AMANi