
I have two strings like this:

String1: EnableAdvertResult
String2: AdvertisementDel

Then I split them into words like this:

 X[0]=Enable X[1]=Advert X[2]=Result

 Y[0]=Advertisement Y[1]=Del

Then I compute the similarity between each pair of elements, like this:

sim(X[0], Y[0]) = a,
sim(X[0], Y[1]) = b,
sim(X[1], Y[0]) = c,
sim(X[1], Y[1]) = d,
sim(X[2], Y[0]) = e,
sim(X[2], Y[1]) = f

Now I want to know: what is the best way to compute the similarity between String1 and String2 based on the pairwise sim values above?

Abimaran Kugathasan
  • Your question is more related to algorithms. A related question is http://stackoverflow.com/questions/653157/a-better-similarity-ranking-algorithm-for-variable-length-strings – Mihai8 Mar 07 '13 at 15:47

2 Answers


This is called the Levenshtein distance. C# code can be found at Levenshtein distance c#. I'm sure you can find Java code too.
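For reference, here is a minimal Java sketch of the standard dynamic-programming Levenshtein distance. The class name `Levenshtein` and method name `distance` are made up for illustration and are not taken from any library; it is only meant to show the idea, not to replace a tested implementation.

```java
// Hypothetical helper class, not part of any library.
final class Levenshtein {
    private Levenshtein() {}

    /**
     * Minimum number of single-character insertions, deletions,
     * or substitutions needed to turn a into b.
     */
    static int distance(String a, String b) {
        // Two rows of the DP table are enough: the previous and the current one.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1,   // insertion
                                 prev[j] + 1),      // deletion
                        prev[j - 1] + cost);        // substitution or match
            }
            // Reuse the arrays: the current row becomes the previous row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Calling `Levenshtein.distance("kitten", "sitting")` gives 3. If you need a similarity in the range 0 to 1 rather than a distance, one common option is `1 - distance / max(a.length(), b.length())`.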

urlreader

You want the Levenshtein distance between strings, which is implemented in Apache Commons Lang's StringUtils (getLevenshteinDistance). I've used the Apache version of Levenshtein with good results. Also see this Stack Overflow article about string comparisons.
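If you want to keep the word-by-word approach from the question, one possible way to fold the pairwise sim values into a single score is to take each word's best match on the other side and average those. The sketch below is only an illustration under stated assumptions: `TokenSimilarity`, `splitCamelCase`, `combine`, and `prefixSimilarity` are hypothetical names, sim is assumed to return a value between 0 and 1, and `prefixSimilarity` is a deliberately crude stand-in that you would replace with a normalized Levenshtein or whatever measure you already use.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical helper class, not part of any library.
final class TokenSimilarity {
    private TokenSimilarity() {}

    /** Splits "EnableAdvertResult" into Enable, Advert, Result. */
    static String[] splitCamelCase(String s) {
        // Zero-width split between a lowercase letter or digit and an uppercase letter.
        return s.split("(?<=[a-z0-9])(?=[A-Z])");
    }

    /**
     * Averages every word's best match on the other side, over the words of
     * both strings. Assumes sim returns values in [0, 1].
     */
    static double combine(String[] x, String[] y,
                          ToDoubleBiFunction<String, String> sim) {
        if (x.length == 0 || y.length == 0) {
            return 0.0;
        }
        double[] bestForX = new double[x.length];
        double[] bestForY = new double[y.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < y.length; j++) {
                double s = sim.applyAsDouble(x[i], y[j]);
                bestForX[i] = Math.max(bestForX[i], s);
                bestForY[j] = Math.max(bestForY[j], s);
            }
        }
        double sum = 0.0;
        for (double v : bestForX) sum += v;
        for (double v : bestForY) sum += v;
        return sum / (x.length + y.length);
    }

    /** Stand-in measure: shared prefix length (ignoring case) over the longer length. */
    static double prefixSimilarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0; // two empty words are treated as identical
        }
        int shorter = Math.min(a.length(), b.length());
        int common = 0;
        while (common < shorter
                && Character.toLowerCase(a.charAt(common))
                   == Character.toLowerCase(b.charAt(common))) {
            common++;
        }
        return (double) common / longer;
    }
}
```

Usage would look like `TokenSimilarity.combine(TokenSimilarity.splitCamelCase("EnableAdvertResult"), TokenSimilarity.splitCamelCase("AdvertisementDel"), TokenSimilarity::prefixSimilarity)`. Averaging best matches is a design choice, not the only one: it rewards words that have a close counterpart (Advert and Advertisement) and penalizes words that match nothing (Enable, Result, Del). A weighted average or an optimal one-to-one assignment are alternatives if that behaviour does not suit your data.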

Michael Shopsin