Solr/lucene - Name and Address search

Question

I have a Solr index with around 1 billion records. Each record has two fields: name and address.

For the name field, I'm using the Beider-Morse filter for phonetic matching. I also have (or will create) good synonym lists (such as Bengaluru and Banglore) and stopword lists (such as Mr, Mrs, village, town, city). I'm also satisfied with the tokenizers I'm using for both fields.

I'm not able to create a query that returns only good matches. Can somebody offer some suggestions?

Basically, I want to differentiate between no match, probable match, and exact match for a given name and address. I realize this is subjective, since the boundary between these three categories is very thin.

Because Solr scores are relative, a cutoff boundary based on score is not recommended. What else can I do instead?

A related question of mine, which covers part of what I'm trying to do: How to form a Solr edismax query with mutiple fields and different minimum match and boosts for different fields?

Currently not finalized any query. But will go for edismax most probably — Rajat Goel, Sep 03 '19 at 05:48

score 2 · Accepted Answer · answered Sep 03 '19 at 06:46

Have one field with the exact terms (i.e. no synonyms, no phonetics, etc.), one field with synonyms and/or phonetics, and any other combination you need for scoring. Then apply boosts to those fields based on the search result profile you want, so that a hit in the exact field counts for more than a hit in the looser field.
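As a rough illustration of the field-plus-boost idea, here is a minimal Python sketch that builds edismax request parameters. The field names `name_exact` and `name_fuzzy` and the boost values are assumptions made up for this example, not part of the asker's schema; `defType`, `qf` and `fl` are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Hypothetical field names (assumptions for this sketch):
#   name_exact - indexed without synonyms or phonetics
#   name_fuzzy - indexed with synonyms and/or the Beider-Morse filter
# The boost values are illustrative and would need tuning on real data.
NAME_BOOSTS = {"name_exact": 10.0, "name_fuzzy": 2.0}


def build_edismax_params(text, boosts):
    """Return Solr request parameters for an edismax query over boosted fields."""
    # qf takes a space-separated list of field^boost entries.
    qf = " ".join(f"{field}^{boost:g}" for field, boost in boosts.items())
    return {"q": text, "defType": "edismax", "qf": qf, "fl": "id,score"}


params = build_edismax_params("Rajat Goel", NAME_BOOSTS)
query_string = urlencode(params)  # append to the collection's /select URL
```

The same helper could be called with a second boost map for the address fields. How the name and address clauses are combined into one request is left open here.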

To tell which field produced a hit for a given document, you can use either highlighting or the debug output (there are multiple questions covering possible ways to do that). Knowing whether the exact field or only the synonym/phonetic field matched lets you separate the match types without relying on a score cutoff.
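One possible way to turn "which field matched" into the three categories is sketched below in Python. It assumes highlighting was requested on four hypothetical fields (`name_exact`, `name_fuzzy`, `address_exact`, `address_fuzzy`) and reads the `highlighting` section of the JSON response, which maps each document id to the fields that produced snippets. The classification rule itself is an assumption for illustration. Note that a highlight only shows that at least one term matched in a field, so a stricter setup would also need a minimum-match requirement.

```python
# Hypothetical field groups (assumptions for this sketch).
NAME_FIELDS = {"name_exact", "name_fuzzy"}
ADDRESS_FIELDS = {"address_exact", "address_fuzzy"}


def classify(doc_highlighting):
    """Classify one document from its entry in the 'highlighting' section."""
    # Keep only fields that actually returned at least one snippet.
    matched = {field for field, snippets in doc_highlighting.items() if snippets}
    if {"name_exact", "address_exact"} <= matched:
        return "exact match"
    if matched & NAME_FIELDS and matched & ADDRESS_FIELDS:
        return "probable match"
    return "no match"


# Made-up response fragment shaped like Solr's highlighting output.
sample_highlighting = {
    "doc1": {"name_exact": ["<em>Rajat</em> Goel"], "address_exact": ["<em>Bengaluru</em>"]},
    "doc2": {"name_fuzzy": ["<em>Rajat</em> Goyal"], "address_fuzzy": ["<em>Bangalore</em>"]},
    "doc3": {"name_fuzzy": ["<em>Rajat</em> Goyal"]},
}

labels = {doc_id: classify(hl) for doc_id, hl in sample_highlighting.items()}
```

Because the label depends on which fields matched rather than on the relative score, it stays comparable across different queries.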
