How does fuzzywuzzy (Python) compute a score when the sentences have no matching words?

Question

I am using fuzzywuzzy to measure the similarity between sentences.

When I compare the two sentences 'user attempts login' and 'acceptance criteria':

fuzz.token_set_ratio('user attempts login', 'acceptance criteria')

I get a score of 42.

Could someone please help me understand how the score can be 42 when the sentences have no words in common?

Check out [When to use which fuzz function to compare 2 strings](https://stackoverflow.com/questions/31806695/when-to-use-which-fuzz-function-to-compare-2-strings) — DarrylG, Mar 23 '21 at 14:07

score 0 · Answer 1 · answered Mar 23 '21 at 17:03

Steps of the Algorithm

fuzz.token_set_ratio performs the following steps:

1. Split each sentence into words and remove duplicate words.
2. Create three lists:
   - remainder1 = words that are only in the first sentence
   - remainder2 = words that are only in the second sentence
   - intersection = words that are in both sentences
3. Sort the words in each of the three lists and join them into one string per list:
   - sorted_remainder1
   - sorted_remainder2
   - sorted_intersection
4. Join the strings, separated by a space, in the following way:
   - combined1 = <sorted_intersection> <sorted_remainder1>
   - combined2 = <sorted_intersection> <sorted_remainder2>
5. Calculate the following similarities:
   - fuzz.ratio(sorted_intersection, combined1)
   - fuzz.ratio(sorted_intersection, combined2)
   - fuzz.ratio(combined1, combined2)
6. Return the maximum of those similarities.
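
The steps above can be sketched in plain Python. This is only an approximation, not fuzzywuzzy's actual source: it assumes difflib.SequenceMatcher from the standard library as a stand-in for fuzz.ratio, the names simple_ratio and token_set_ratio_sketch are made up for illustration, and the only preprocessing is lowercasing, so exact scores may differ from the library's.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Character-level similarity as an integer from 0 to 100.

    Stand-in for fuzz.ratio (an assumption, not the library's code).
    """
    if not a or not b:
        return 0  # an empty string is treated as having no similarity
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_set_ratio_sketch(s1: str, s2: str) -> int:
    """Hypothetical re-implementation of the token-set steps."""
    # Split into words and remove duplicates.
    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())

    # Words shared by both sentences, and words unique to each.
    intersection = tokens1 & tokens2
    remainder1 = tokens1 - tokens2
    remainder2 = tokens2 - tokens1

    # Sort each group and join it into a single string.
    sorted_intersection = " ".join(sorted(intersection))
    sorted_remainder1 = " ".join(sorted(remainder1))
    sorted_remainder2 = " ".join(sorted(remainder2))

    # Prefix each remainder with the shared words.
    combined1 = (sorted_intersection + " " + sorted_remainder1).strip()
    combined2 = (sorted_intersection + " " + sorted_remainder2).strip()

    # The best of the three pairwise comparisons wins.
    return max(
        simple_ratio(sorted_intersection, combined1),
        simple_ratio(sorted_intersection, combined2),
        simple_ratio(combined1, combined2),
    )


print(token_set_ratio_sketch("user attempts login", "acceptance criteria"))
```

When one sentence's words are a subset of the other's, the intersection string equals one of the combined strings, so the sketch returns 100. When no words are shared, only the third comparison can be nonzero.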

Example

For the strings 'user attempts login' and 'acceptance criteria', this leads to the following result:

remainder1 = ['user', 'attempts', 'login']
remainder2 = ['acceptance', 'criteria']
intersection = []
sorted_remainder1 = 'attempts login user'
sorted_remainder2 = 'acceptance criteria'
sorted_intersection = ''
combined1 = 'attempts login user'
combined2 = 'acceptance criteria'

fuzz.ratio(sorted_intersection, combined1) = 0
fuzz.ratio(sorted_intersection, combined2) = 0
fuzz.ratio(combined1, combined2) = 42

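In your specific case, where the sentences share no words, this is essentially what fuzz.token_sort_ratio does: it only sorts the words in both sentences and compares the results using fuzz.ratio. Since fuzz.ratio compares the strings character by character rather than word by word, 'attempts login user' and 'acceptance criteria' still score 42 because of the characters they have in common.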
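
To see where a nonzero score can come from when no words match, it may help to look at the characters the two sorted strings share. The sketch below again assumes difflib.SequenceMatcher as a stand-in for fuzz.ratio, so the score it prints is not guaranteed to equal 42; sort_tokens is a hypothetical helper, not part of fuzzywuzzy.

```python
from difflib import SequenceMatcher


def sort_tokens(sentence: str) -> str:
    """Lowercase, split on whitespace, sort the words, and re-join them."""
    return " ".join(sorted(sentence.lower().split()))


a = sort_tokens("user attempts login")  # 'attempts login user'
b = sort_tokens("acceptance criteria")  # 'acceptance criteria'

matcher = SequenceMatcher(None, a, b)

# Each matching block is a run of characters found in both strings, in order.
shared_runs = [
    a[block.a:block.a + block.size]
    for block in matcher.get_matching_blocks()
    if block.size > 0
]

# ratio() is 2 * (matched characters) / (total characters in both strings).
score = round(100 * matcher.ratio())

print(shared_runs, score)
```

The runs printed are fragments of words, not whole words, which is why two sentences without a single common word can still get a score well above 0.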
