compute similarity between strings based on primary similarity

Question

I have two strings like this:

String1: EnableAdvertResult
String2: AdvertisementDel

Then I split them into tokens like this:

 X[0]=Enable X[1]=Advert X[2]=Result

 Y[0]=Advertisement Y[1]=Del

Then I compute the similarity between each pair of elements, like this:

sim(X[0], Y[0]) = a
sim(X[0], Y[1]) = b
sim(X[1], Y[0]) = c
sim(X[1], Y[1]) = d
sim(X[2], Y[0]) = e
sim(X[2], Y[1]) = f

Now I want to know: what is the best way to compute the similarity between String1 and String2 based on the pairwise sim values above?

Your question is more related to algorithms. A related question is http://stackoverflow.com/questions/653157/a-better-similarity-ranking-algorithm-for-variable-length-strings — Mihai8, Mar 07 '13 at 15:47

score 0 · Answer 1 · answered Mar 07 '13 at 15:47


This is called the Levenshtein distance. C# code can be found at Levenshtein distance c#. I'm sure you can find Java code too.
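For readers who want to see what the Levenshtein distance computes, here is a minimal Python sketch of the standard dynamic-programming formulation. It is not the C# code the answer links to. The function names `levenshtein` and `similarity` are chosen here for illustration, and dividing by the longer length is just one common way to turn the distance into a score in [0, 1].

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible
    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("Advert", "Advertisement"))
print(similarity("Advert", "Advertisement"))
```

A function like `similarity` could serve as the `sim` used for the token pairs in the question, although the question does not say which measure it actually uses.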

answered Mar 07 '13 at 15:47

urlreader


score 0 · Answer 2 · edited May 23 '17 at 11:43


You want the Levenshtein distance between strings, which is implemented in Apache Commons Lang's StringUtils. I've used the Apache version of Levenshtein with good results. Also see this Stack Overflow question about string comparisons.
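If the goal is to keep the per-token scores a to f from the question rather than compare the whole strings directly, one possible way to combine them is a symmetric best-match average. The Python sketch below illustrates that idea. It is an assumption, not something proposed in this thread. The helper names are hypothetical, and `token_sim` uses `difflib.SequenceMatcher` from the standard library only as a stand-in for whatever `sim` the question has in mind.

```python
import re
from difflib import SequenceMatcher


def split_camel_case(name):
    """Split 'EnableAdvertResult' into ['Enable', 'Advert', 'Result']."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", name)


def token_sim(x, y):
    """Stand-in for the question's sim(); any score in [0, 1] would do."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def best_match_average(xs, ys, sim=token_sim):
    """Average each token's best score against the other side, in both directions."""
    if not xs or not ys:
        return 0.0
    # For the question's data: (max(a, b) + max(c, d) + max(e, f)) / 3
    x_to_y = sum(max(sim(x, y) for y in ys) for x in xs) / len(xs)
    # For the question's data: (max(a, c, e) + max(b, d, f)) / 2
    y_to_x = sum(max(sim(x, y) for x in xs) for y in ys) / len(ys)
    return (x_to_y + y_to_x) / 2


def string_similarity(s1, s2, sim=token_sim):
    """Tokenize both names, then combine the pairwise token scores."""
    return best_match_average(split_camel_case(s1), split_camel_case(s2), sim)


print(string_similarity("EnableAdvertResult", "AdvertisementDel"))
```

Averaging in both directions keeps unmatched tokens such as Enable or Del from being ignored, so a single strong pair like Advert/Advertisement cannot push the overall score to 1 on its own. Other choices, such as taking the maximum over all pairs or solving an optimal one-to-one assignment, are equally defensible depending on the use case.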

edited May 23 '17 at 11:43

Community


answered Mar 07 '13 at 15:51

Michael Shopsin

