I'm trying to calculate the Levenshtein distance between two Pandas columns, but I'm getting stuck. I'm using the textdistance library. Here is a minimal, reproducible example:
import pandas as pd
from textdistance import levenshtein
attempts = [['passw0rd', 'pasw0rd'],
            ['passwrd', 'psword'],
            ['psw0rd', 'passwor']]
df = pd.DataFrame(attempts, columns=['password', 'attempt'])
   password  attempt
0  passw0rd  pasw0rd
1   passwrd   psword
2    psw0rd  passwor
My poor attempt:
df.apply(lambda x: levenshtein.distance(*zip(x['password'] + x['attempt'])), axis=1)
This is how the function works. It takes two strings as arguments:
levenshtein.distance('helloworld', 'heloworl')
Out[1]: 2
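
Since the function takes exactly two strings, the row-wise computation amounts to pairing each password with its attempt and calling the function once per pair. Below is a minimal sketch of that idea in plain Python. Note that `levenshtein_distance` is a hypothetical stand-in written here for illustration (a standard dynamic-programming edit distance), not the textdistance implementation, and the two lists simply mirror the columns of the example DataFrame.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Hypothetical stand-in for levenshtein.distance: classic DP edit distance."""
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


# Same data as the 'password' and 'attempt' columns in the example above
passwords = ['passw0rd', 'passwrd', 'psw0rd']
attempts = ['pasw0rd', 'psword', 'passwor']

# One call per (password, attempt) pair, each call receiving two whole strings
distances = [levenshtein_distance(p, a) for p, a in zip(passwords, attempts)]
print(distances)

# With the DataFrame itself, the equivalent would presumably be to pass the two
# cells directly instead of zipping them:
# df.apply(lambda row: levenshtein.distance(row['password'], row['attempt']), axis=1)
```

The key point in this sketch is that each call gets the two strings as separate arguments; zipping a concatenated string splits it into single characters, which is not what the function expects.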