I want to measure the similarity between two strings. en.wikipedia has examples of several string similarity metrics, and code.google has a Python implementation of Levenshtein distance.
Is there a better algorithm (and, hopefully, a Python library) under these constraints:
- I want to do fuzzy matching between strings, e.g. matches('Hello, All you people', 'hello, all You peopl') should return True.
- False negatives are acceptable; false positives are not, except in extremely rare cases.
- This runs in a non-real-time setting, so speed is not much of a concern.
- [Edit] I am comparing multi-word strings.
Would something other than Levenshtein distance (or Levenshtein ratio) be a better algorithm for my case?
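
For reference, here is a rough sketch of the Levenshtein baseline I am comparing against. It is only an illustration, not the code.google implementation: the helper name levenshtein, the case folding, the 0.9 threshold, and the normalization (1 - distance / length of the longer string) are all assumptions of mine, and other definitions of "Levenshtein ratio" exist.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def matches(a: str, b: str, threshold: float = 0.9) -> bool:
    """Fuzzy match; the threshold is an assumed value, set high to limit false positives."""
    a, b = a.lower(), b.lower()  # assumption: case differences should not count
    longest = max(len(a), len(b))
    if longest == 0:
        return True  # two empty strings are treated as identical
    similarity = 1 - levenshtein(a, b) / longest
    return similarity >= threshold


print(matches('Hello, All you people', 'hello, all You peopl'))
print(matches('Hello, All you people', 'Goodbye, everyone'))
```

Raising the threshold trades more false negatives for fewer false positives, which is the direction I care about.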