I have a dataframe like this:
       apple  aple  apply
apple      0     0      0
aple       0     0      0
apply      0     0      0
I want to calculate the string distance between every pair of labels, e.g. apple -> aple. The result I am after looks like this:
       apple  aple  apply
apple      0    32     14
aple      32     0     30
apply     14    30      0
This is the code I am currently using, but it is very slow on large data:
    from simhash import Simhash

    columns = df.columns
    for r in columns:
        for c in columns:
            # Simhash distance between row label r and column label c
            m[r][c] = Simhash(r).distance(Simhash(c))
Can anyone help me calculate these distances more efficiently?
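
One direction that might help, as a rough sketch rather than a tested solution: the loop above rebuilds both Simhash objects for every pair, so each label is hashed many times. If each label is hashed once up front and its fingerprint kept as a plain integer, the pairwise distances reduce to counting the differing bits of an XOR, which NumPy can do for all pairs at once. The function name `pairwise_hamming` is made up for illustration, and the code assumes the fingerprints fit in 64 bits (I believe this is the default width for `Simhash`, and that the integer is exposed as `.value`, but both are assumptions here).

```python
import numpy as np


def pairwise_hamming(fingerprints):
    """Return an (n, n) matrix of Hamming distances between 64-bit integers."""
    fp = np.asarray(fingerprints, dtype=np.uint64)
    n = len(fp)
    # XOR every fingerprint with every other one via broadcasting -> shape (n, n)
    xor = fp[:, None] ^ fp[None, :]
    # View each uint64 as 8 bytes, expand the bytes to bits, and count the set bits
    bits = np.unpackbits(xor.view(np.uint8).reshape(n, n, 8), axis=-1)
    return bits.sum(axis=-1)


# Hypothetical usage with the dataframe from the question
# (assumes Simhash(...).value holds the integer fingerprint):
#   fingerprints = [Simhash(c).value for c in df.columns]
#   m = pd.DataFrame(pairwise_hamming(fingerprints),
#                    index=df.columns, columns=df.columns)
```

The intermediate bit array takes n * n * 64 bytes, so for a very large number of labels this would probably need to be processed in row chunks.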