I have a Solr index with around 1 billion records. Each record has two fields: name and address.
For the name field, I'm using the Beider-Morse filter for phonetic matching. I also have (or will create) good synonym lists (like Bengaluru and Banglore) and stopword lists (like Mr, Mrs, village, town, city, etc.). I'm also satisfied with the tokenizer I'm using for both fields.
I'm not able to create a query that returns only good matches. Can somebody offer suggestions?
Basically, I want to differentiate between no match, probable match, and exact match for a given name and address. I realize this is a very subjective topic, as the boundaries between these three types are very thin.
As Solr scores are relative to each query, a cutoff boundary based on score is not recommended. What else can I do instead?
A related question of mine, which covers part of what I'm trying to do: "How to form a Solr edismax query with mutiple fields and different minimum match and boosts for different fields?"
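
To make the per-field idea concrete, here is a minimal sketch of one way such a request could be assembled: one nested edismax clause per field, each with its own `mm`, combined under the standard Lucene parser. Only the field names `name` and `address` come from my setup; the function name, the `name_q`/`addr_q` parameter names, the boost of 2, and the `mm` percentages are placeholders I made up for illustration, not tuned values.

```python
from urllib.parse import urlencode


def build_select_params(name, address, name_mm="100%", address_mm="60%", rows=10):
    """Build /select parameters with a separate edismax clause per field.

    All boosts and mm values here are illustrative assumptions.
    """
    # Each _query_ clause is a nested query with its own parser settings.
    # v=$name_q / v=$addr_q are dereferenced by Solr from the request
    # parameters of the same names, which avoids escaping user input here.
    q = (
        f"+_query_:\"{{!edismax qf='name^2' mm='{name_mm}' v=$name_q}}\" "
        f"+_query_:\"{{!edismax qf='address' mm='{address_mm}' v=$addr_q}}\""
    )
    return {
        "defType": "lucene",   # outer parser; edismax is used only in the nested clauses
        "q": q,
        "name_q": name,        # hypothetical parameter name
        "addr_q": address,     # hypothetical parameter name
        "fl": "name,address,score",
        "rows": rows,
    }


params = build_select_params("Ramesh Kumar", "MG Road Bengaluru")
query_string = urlencode(params)  # ready to append to .../select?
```

Both clauses are required (`+`), so a document has to satisfy the name and the address constraints separately. This still does not solve the no match / probable match / exact match classification by itself.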