I am using fuzzywuzzy and rapidfuzz to find names mentioned in comments. I have read the documentation for the "token_set_ratio" function, but I still don't understand the following result:
# import assumed here; fuzzywuzzy exposes a function with the same name
from rapidfuzz import fuzz

# I preprocessed the comments to remove stop words and other commonly mentioned words
fuzz.token_set_ratio("reporting michael anders sven straumann guy called jonatjan smith partners", "jonathan smith")
# returns 52.6
"jonatjan smith" differs from "jonathan smith" by only one character, so why is the ratio so low?
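To make the question concrete, here is a rough sketch of the token-set steps as I understand them from the documentation. It uses `difflib` from the standard library instead of the libraries' own scorer, and `ratio` and `token_set_parts` are names I made up for illustration, so it is an approximation and not the actual implementation.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Similarity in the range 0-100, based on difflib (an approximation)."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def token_set_parts(s1, s2):
    """Build the three strings that a token-set comparison works with."""
    tokens1, tokens2 = set(s1.split()), set(s2.split())
    common = " ".join(sorted(tokens1 & tokens2))  # tokens shared by both
    only1 = " ".join(sorted(tokens1 - tokens2))   # tokens only in s1
    only2 = " ".join(sorted(tokens2 - tokens1))   # tokens only in s2
    return common, f"{common} {only1}".strip(), f"{common} {only2}".strip()


comment = "reporting michael anders sven straumann guy called jonatjan smith partners"
name = "jonathan smith"

common, combined1, combined2 = token_set_parts(comment, name)
scores = {
    "common vs combined1": ratio(common, combined1),
    "common vs combined2": ratio(common, combined2),
    "combined1 vs combined2": ratio(combined1, combined2),
}
best = max(scores.values())

print(repr(common), repr(combined2))  # 'smith' 'smith jonathan'
print(round(best, 1))                 # 52.6
```

If this reading is right, "jonatjan" and "jonathan" count as two different tokens, so the only shared token is "smith". The best of the three comparisons is then "smith" against "smith jonathan", which gives 2 * 5 / 19 ≈ 52.6 and matches the value I get.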
Moreover, is there an option, or a different scorer, that would give "jonathan smith" a higher score in this case?
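One workaround I have been considering is to slide a window over the comment's tokens and score each window against the name. Below is a minimal sketch of that idea, again using `difflib` as a stand-in scorer; `best_window` is a hypothetical helper of mine, not something either library provides.

```python
from difflib import SequenceMatcher


def best_window(text, name):
    """Return (score, window) for the run of tokens most similar to name."""
    tokens = text.split()
    size = len(name.split())  # compare windows with as many tokens as the name
    best_score, best_text = 0.0, ""
    for start in range(max(len(tokens) - size + 1, 1)):
        window = " ".join(tokens[start:start + size])
        score = 100 * SequenceMatcher(None, window, name).ratio()
        if score > best_score:
            best_score, best_text = score, window
    return best_score, best_text


comment = "reporting michael anders sven straumann guy called jonatjan smith partners"
score, window = best_window(comment, "jonathan smith")
print(window, round(score, 1))
```

With this, the window "jonatjan smith" comes out on top with a score above 90. Both libraries also ship `fuzz.partial_ratio`, which may do something similar, but I have not checked how it scores this example.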
Thanks for your help, Michael