I recently answered a question, and in its comments another user raised a follow-up query that I couldn't answer.
Searching for a product even if code is misspelled
Given a fuzzy search parameter which will use Regular Expressions to filter a 'large' datasource, how would you go about assigning a value for 'relevance' or 'best match'?
The filter will work correctly but I have no idea how to adapt it in such a way that you can identify what values are closest to the provided search string, and what values are farthest away.
Closest in this case would be an exact match to the string (assume the '+' character doesn't exist; anything that still matches is closest). The farthest, i.e. worst, match would be exactly the opposite: the value with the largest number of non-matching characters.
For the sake of avoiding arguments, let's assume the fuzzy search uses a mix of '+' and '*' in the search pattern, e.g. X+HG*UPO+Z* or something along those lines.
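To make the idea of "closest" and "farthest" concrete, here is a minimal Python sketch of one possible reading. It is only an illustration. It assumes that '+' stands for one or more arbitrary characters and '*' for zero or more, which the question does not spell out, and the names pattern_to_regex, wildcard_cost and rank are hypothetical helpers invented for this example.

```python
import re
from typing import Optional


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate the search pattern into a regex, one capture group per wildcard.

    Assumption: '+' = one or more arbitrary characters, '*' = zero or more.
    Every other character is treated as a literal.
    """
    parts = []
    for ch in pattern:
        if ch == "+":
            parts.append("(.+?)")
        elif ch == "*":
            parts.append("(.*?)")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_cost(pattern: str, candidate: str) -> Optional[int]:
    """Count the characters absorbed by wildcards; None if the filter rejects it."""
    match = pattern_to_regex(pattern).fullmatch(candidate)
    if match is None:
        return None
    return sum(len(group) for group in match.groups())


def rank(pattern: str, candidates: list) -> list:
    """Return (cost, value) pairs for matching values, best (lowest cost) first."""
    scored = [(wildcard_cost(pattern, c), c) for c in candidates]
    return sorted((cost, value) for cost, value in scored if cost is not None)


if __name__ == "__main__":
    data = ["XabHGcdUPOefZgh", "X1HGUPO2Z", "ABC"]
    for cost, value in rank("X+HG*UPO+Z*", data):
        print(cost, value)
```

Note that with a fully anchored match this count is simply the candidate's length minus the number of literal characters in the pattern, so it ties for values of equal length. It illustrates the definition of "non-matching characters" rather than solving the problem of avoiding a length comparison.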
The goal is to avoid using a string length comparison. In the question I answered, the data was almost guaranteed to always be the same length anyway.