-1

Is there any built-in function in Python that performs a string comparison like Ngram.Compare('text','text2')? I don't want to install the N-gram module. I tried all the public and private methods that I found by running dir('text').

I want to get a percentage match when comparing two strings.

jamylak
saun jean

2 Answers

6

You want the Levenshtein distance, which is implemented by the python-Levenshtein package:

http://pypi.python.org/pypi/python-Levenshtein/

If you don't want to install anything, you will have to write the code yourself. The algorithm is described here:

http://en.wikipedia.org/wiki/Levenshtein_distance

5

Use difflib from the standard library.
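As a minimal sketch of the difflib route: `difflib.SequenceMatcher.ratio()` returns a similarity score between 0 and 1, which can be scaled to a percentage. The helper name `percent_match` is made up for this illustration and is not part of difflib.

```python
import difflib

def percent_match(a, b):
    # percent_match is a hypothetical helper name, not a difflib function.
    # ratio() is 2*M/T, where M is the number of matching characters and
    # T is the total number of characters in both strings.
    return 100.0 * difflib.SequenceMatcher(None, a, b).ratio()

print(percent_match('text', 'text2'))
print(percent_match('spa', 'spam'))
```

Note that this is not an n-gram measure, so the numbers will generally differ from what an n-gram comparison would report for the same pair of strings.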

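You can also compute a Levenshtein distance:

def lev(seq1, seq2):
    # Levenshtein distance keeping only two rows in memory. Each row has one
    # extra cell at the end, reached as index -1, that holds the boundary
    # value for the empty prefix.
    oneago = None
    thisrow = list(range(1, len(seq2) + 1)) + [0]
    for x in range(len(seq1)):
        oneago, thisrow = thisrow, [0] * len(seq2) + [x + 1]
        for y in range(len(seq2)):
            delcost = oneago[y] + 1
            addcost = thisrow[y - 1] + 1
            subcost = oneago[y - 1] + (seq1[x] != seq2[y])
            thisrow[y] = min(delcost, addcost, subcost)
    return thisrow[len(seq2) - 1]

def di(seq1, seq2):
    # Distance divided by the length of the shorter sequence; raises
    # ZeroDivisionError if either sequence is empty.
    return float(lev(seq1, seq2)) / min(len(seq1), len(seq2))

print(lev('spa', 'spam'))
print(di('spa', 'spam'))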
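Since the question asks for a percentage match rather than a distance, one possible conversion is to divide the edit distance by the length of the longer string and subtract from 1. The sketch below is self-contained and uses a plain two-row dynamic-programming table; the names `levenshtein` and `similarity_percent` are chosen for this illustration, and normalising by the longer length is an assumption, not the only reasonable choice.

```python
def levenshtein(a, b):
    # Two-row dynamic programming: prev[j] is the distance between the
    # first i-1 characters of a and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity_percent(a, b):
    # Hypothetical helper: 100 means identical, 0 means nothing in common.
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / float(longest))

print(similarity_percent('spa', 'spam'))  # 75.0
```

Dividing by the longer length keeps the result between 0 and 100, because the edit distance can never exceed the length of the longer string.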