"Similar" can mean many things, but judging from your examples, I am fairly sure you are looking for this measure:
Match% = 2* Longest_Common_Substring(a, b) / (len(a) + len(b)) * 100
Search for "Longest Common Substring" and you will find plenty of Python implementations.
One such implementation, from the Wikibooks page Algorithm Implementation/Strings/Longest common substring, is as follows:
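def longest_common_substring(s1, s2):
    # m[x][y] is the length of the common suffix of s1[:x] and s2[:y]
    # (range is used instead of xrange so this runs on Python 2 and 3)
    m = [[0] * (1 + len(s2)) for _ in range(1 + len(s1))]
    longest, x_longest = 0, 0
    for x in range(1, 1 + len(s1)):
        for y in range(1, 1 + len(s2)):
            if s1[x - 1] == s2[y - 1]:
                m[x][y] = m[x - 1][y - 1] + 1
                if m[x][y] > longest:
                    longest = m[x][y]
                    x_longest = x
            else:
                m[x][y] = 0
    # the longest common substring ends just before index x_longest in s1
    return s1[x_longest - longest: x_longest]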
Wrapping it in a similarity function gives results that match your expectations:
>>> def similarity(s1, s2):
...     return 2. * len(longest_common_substring(s1, s2)) / (len(s1) + len(s2)) * 100
...
>>> similarity("abcd","abce")
75.0
>>> similarity("abcd","dcba")
25.0
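If you would rather not carry your own dynamic-programming table, the standard library's difflib can find the longest matching block for you. The following is only a sketch of that alternative: the name similarity_difflib is made up for this example, and returning 100.0 for two empty strings is my assumption (the formula itself would divide by zero there).

```python
from difflib import SequenceMatcher


def similarity_difflib(s1, s2):
    """Match% from the longest common substring, located with difflib."""
    if not s1 and not s2:
        # Assumption: two empty strings are treated as identical, since
        # the formula would otherwise divide by zero.
        return 100.0
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent elements in long sequences, so every character is compared.
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    match = matcher.find_longest_match(0, len(s1), 0, len(s2))
    # match.size is the length of the longest common contiguous block
    return 2.0 * match.size / (len(s1) + len(s2)) * 100


print(similarity_difflib("abcd", "abce"))  # 75.0
print(similarity_difflib("abcd", "dcba"))  # 25.0
```

Note that SequenceMatcher.ratio() is a different measure: it sums all matching blocks rather than only the longest one, so it will generally not reproduce the numbers above.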