I want to write a function that records the edit operations needed to transform one string into another.
Example:
A = batyu
B = beauty
diff(A,B) has to return:
[[1, "Insert", "e"], [5, "Delete"], [3, "Insert", "u"]]

I used Levenshtein.editops, but I want to code the function that does this myself.

  • Have you seen [How to modify Levenshtein algorithm, to know if it inserted, deleted, or substituted a character?](https://stackoverflow.com/questions/24190003/how-to-modify-levenshtein-algorithm-to-know-if-it-inserted-deleted-or-substit) The [project](https://pypi.org/project/python-Levenshtein/) you mention that you want to replicate is open source. – kcsquared Apr 26 '22 at 21:53
  • Please provide enough code so others can better understand or reproduce the problem. – lpounng Apr 27 '22 at 03:02

2 Answers

The Wikipedia article on Levenshtein distance gives the recurrence the algorithm uses. Now it's your turn to implement it in Python.

If you have code that does not do what you expect it to, feel free to post another question detailing what you tried, what you expected and why it didn't work.

If you can read C, you can also check out the implementation of editops.
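
As a starting point, here is a minimal sketch of that approach: fill the standard Levenshtein distance table, then walk back from the bottom-right corner to recover the operations. The names `edit_ops` and `apply_ops` are hypothetical, and the `(op, i, j)` tuples (operation, position in the source, position in the destination) are an assumed output format, not a copy of what the library's C code does. When several minimal edit scripts exist, this backtrace prefers replace, then delete, then insert.

```python
def edit_ops(a, b):
    """Return a list of (op, i, j) tuples that turn string a into string b.

    i is an index into a, j is an index into b (assumed convention).
    """
    m, n = len(a), len(b)
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a[i-1]
                dist[i][j - 1] + 1,         # insert b[j-1]
                dist[i - 1][j - 1] + cost,  # keep or replace
            )

    # Walk back from the bottom-right corner, recording each edit.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
            i, j = i - 1, j - 1             # characters match, no edit
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, j))
            i -= 1
        else:
            ops.append(("insert", i, j - 1))
            j -= 1
    ops.reverse()
    return ops


def apply_ops(a, b, ops):
    """Replay ops on a (reading inserted characters from b) as a sanity check."""
    out = []
    pos = 0  # next unread index in a
    for op, i, j in ops:
        out.extend(a[pos:i])  # copy the unchanged characters before this edit
        pos = i
        if op == "insert":
            out.append(b[j])
        elif op == "replace":
            out.append(b[j])
            pos = i + 1
        else:  # delete: skip a[i]
            pos = i + 1
    out.extend(a[pos:])
    return "".join(out)


ops = edit_ops("batyu", "beauty")
print(ops)  # [('insert', 1, 1), ('insert', 2, 3), ('delete', 4, 6)]
print(apply_ops("batyu", "beauty", ops))  # beauty
```

Converting these tuples into the `[position, "Insert", char]` lists from the question is then a matter of formatting.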

Lukas Schmid

You can adapt the example from the difflib documentation (https://docs.python.org/3/library/difflib.html#difflib.SequenceMatcher.get_opcodes):

import difflib

a = 'adela'
b = 'adella'
dif = difflib.SequenceMatcher(None, a, b)
opcodes = dif.get_opcodes()
# Each opcode says how a[i1:i2] maps onto b[j1:j2]
for tag, i1, i2, j1, j2 in opcodes:
    print('{:7}   a[{}:{}] --> b[{}:{}] {!r:>8} --> {!r}'.format(
        tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]))

So create your SequenceMatcher object, iterate over the opcodes, and store them however you want. I came across this question while searching for a quick link to the editops documentation. For my purposes, I used the number of non-equal opcodes as a measure of how close the strings were:

print(len([x for x in opcodes if x[0] != 'equal']))
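
To get closer to the list format in the question, the opcodes can be unpacked into one record per character. This is only a sketch: `diff_records` is a hypothetical name, positions are assumed to be indices into `a`, and a `replace` opcode is split into deletes followed by inserts because the question's format only shows those two operations. Note that SequenceMatcher does not promise a minimal edit script, so the result can differ from what Levenshtein.editops returns.

```python
import difflib


def diff_records(a, b):
    """Turn SequenceMatcher opcodes into [pos, "Insert", ch] / [pos, "Delete"] records."""
    records = []
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            # One record per character removed from a
            records.extend([i, "Delete"] for i in range(i1, i2))
        if tag in ("insert", "replace"):
            # Inserted characters come from b and go in at index i2 of a
            records.extend([i2, "Insert", ch] for ch in b[j1:j2])
    return records


print(diff_records("adela", "adella"))  # [[4, 'Insert', 'l']]
```
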
grantr