Do you know any strictly equivalent implementation of the PHP similar_text function in Java?
1
-
warm-up: http://stackoverflow.com/questions/907997/string-distance-library – miku Jan 04 '10 at 16:09
-
Not exactly. PHP's similar_text is different from the Levenshtein distance. From the PHP similar_text manual: "This calculates the similarity between two strings as described in Oliver [1993]. [...] Returns the number of matching chars in both strings." I cannot find any Java implementation of the Oliver similarity algorithm. – Thomasleveil Jan 04 '10 at 16:19
5 Answers
1
Here is my implementation in Java:
package comwebndesignserver.server;

/*
 *
 * DenPashkov 2012
 * http://www.facebook.com/pashkovdenis
 * PHP similar_text implementation
 * 30.07.2012
 *
 */
public class SimilarString {
    private String string = "";
    private String string2 = "";
    public int procent = 0; // similarity as a percentage, 0..100

    // Both strings are lowercased, so the comparison is case-insensitive
    public SimilarString(String str1, String str2) {
        this.string = str1.toLowerCase();
        this.string2 = str2.toLowerCase();
    }

    public SimilarString() {
    }

    // Set the two strings to compare
    public SimilarString setString(String str1, String str2) {
        this.string = str1.toLowerCase();
        this.string2 = str2.toLowerCase();
        return this;
    }

    // Get the similarity of the two strings as a percentage
    public int similar() {
        string = string.trim();
        string2 = string2.trim();
        procent = 0;
        int total = string.length() + string2.length();
        if (total > 0) {
            // sim * 200.0 / (t1_len + t2_len)
            procent = sim(string, string2) * 200 / total;
        }
        return procent;
    }

    // Count matching chars: find the longest common substring,
    // then recurse on the parts to its left and to its right
    private static int sim(String s1, String s2) {
        int max = 0;
        int position1 = 0;
        int position2 = 0;
        for (int p = 0; p < s1.length(); p++) {
            for (int q = 0; q < s2.length(); q++) {
                int l = 0;
                while (p + l < s1.length() && q + l < s2.length()
                        && s1.charAt(p + l) == s2.charAt(q + l)) {
                    l++;
                }
                if (l > max) {
                    max = l;
                    position1 = p;
                    position2 = q;
                }
            }
        }
        if (max == 0) {
            return 0;
        }
        return sim(s1.substring(0, position1), s2.substring(0, position2))
                + max
                + sim(s1.substring(position1 + max), s2.substring(position2 + max));
    }
}
1
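This follows the same algorithm as PHP's similar_text function, as implemented in php_similar_str, php_similar_char and PHP_FUNCTION(similar_text) in the string.c file of the PHP sources. Note that this version lowercases both strings first, whereas PHP's similar_text is case-sensitive.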
// Returns the similarity of the two strings as a percentage (0..100)
private float similarText(String first, String second) {
    first = first.toLowerCase();
    second = second.toLowerCase();
    int total = first.length() + second.length();
    if (total == 0) {
        return 0f; // avoid 0/0 (NaN) when both strings are empty
    }
    return (float) (this.similar(first, second) * 200) / total;
}

// Counts matching characters: finds the longest common substring,
// then recurses on the parts to its left and to its right
private int similar(String first, String second) {
    int p, q, l, sum;
    int pos1 = 0;
    int pos2 = 0;
    int max = 0;
    char[] arr1 = first.toCharArray();
    char[] arr2 = second.toCharArray();
    int firstLength = arr1.length;
    int secondLength = arr2.length;

    for (p = 0; p < firstLength; p++) {
        for (q = 0; q < secondLength; q++) {
            // empty loop body: it only measures the length l of the common run
            for (l = 0; (p + l < firstLength) && (q + l < secondLength) && (arr1[p + l] == arr2[q + l]); l++);
            if (l > max) {
                max = l;
                pos1 = p;
                pos2 = q;
            }
        }
    }
    sum = max;
    if (sum > 0) {
        if (pos1 > 0 && pos2 > 0) {
            sum += this.similar(first.substring(0, pos1), second.substring(0, pos2));
        }
        if ((pos1 + max < firstLength) && (pos2 + max < secondLength)) {
            sum += this.similar(first.substring(pos1 + max), second.substring(pos2 + max));
        }
    }
    return sum;
}

mirek
0
As for Java, your best bet might be the StringUtils class from the Apache Commons Lang library, which contains the getLevenshteinDistance method that the other SO posts mention.

Roman
-
So you could take the longer string's length and subtract the Levenshtein distance to approximate the number that similar_text would produce, and for the percentage result you would divide the result by the length. – lofte Jan 04 '10 at 17:23
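The idea in this comment can be sketched as below. This is only an illustration: the class and method names (LevenshteinApprox, approxMatchingChars, approxPercent) are made up for the example, the Levenshtein distance is hand-rolled so that nothing outside the JDK is needed, and dividing by the longer string's length is an assumption, since the comment does not say which length is meant.

```java
public class LevenshteinApprox {

    // Classic dynamic-programming Levenshtein distance, keeping two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: longer length minus edit distance.
    static int approxMatchingChars(String a, String b) {
        return Math.max(a.length(), b.length()) - levenshtein(a, b);
    }

    // Assumption: the percentage is taken relative to the longer string.
    static double approxPercent(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 0.0 : approxMatchingChars(a, b) * 100.0 / longest;
    }

    public static void main(String[] args) {
        System.out.println(approxMatchingChars("World", "Word")); // 4
        System.out.println(approxMatchingChars("abc", "cab"));    // 1
    }
}
```

For "World" and "Word" this gives 4, but it is not the same measure in general: for "abc" and "cab" it gives 1, while the longest-common-substring recursion shown in the answers above counts 2 ("ab").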
0
- Download the source code for PHP (http://php.net/downloads.php)
- Uncompress it.
- Convert the similar_text() function in ext\standard\string.c to Java.
- Then eat some ice-cream for tea :D
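For step 3, a fairly literal translation could look like the sketch below. It assumes the structure of the C code is "find the longest common run, then recurse on the text to its left and to its right", and it replaces the pointer arithmetic with array offsets. The names (SimilarText, similarChars, similarTextPercent) are made up for this example. It compares Java chars rather than bytes, so results may differ from PHP for non-ASCII input.

```java
public class SimilarText {

    // Counts matching chars between a[aFrom, aTo) and b[bFrom, bTo):
    // finds the first longest common run, then recurses on both sides of it.
    static int similarChars(char[] a, int aFrom, int aTo,
                            char[] b, int bFrom, int bTo) {
        int max = 0;
        int pos1 = 0;
        int pos2 = 0;
        for (int p = aFrom; p < aTo; p++) {
            for (int q = bFrom; q < bTo; q++) {
                int l = 0;
                while (p + l < aTo && q + l < bTo && a[p + l] == b[q + l]) {
                    l++;
                }
                if (l > max) { // strict '>' keeps the first longest run found
                    max = l;
                    pos1 = p;
                    pos2 = q;
                }
            }
        }
        if (max == 0) {
            return 0; // nothing in common (also covers empty ranges)
        }
        return max
                + similarChars(a, aFrom, pos1, b, bFrom, pos2)             // left parts
                + similarChars(a, pos1 + max, aTo, b, pos2 + max, bTo);    // right parts
    }

    // Number of matching chars, case-sensitive.
    static int similarText(String first, String second) {
        return similarChars(first.toCharArray(), 0, first.length(),
                            second.toCharArray(), 0, second.length());
    }

    // Similarity as a percentage: sim * 2 * 100 / (len1 + len2).
    static double similarTextPercent(String first, String second) {
        int total = first.length() + second.length();
        return total == 0 ? 0.0 : similarText(first, second) * 200.0 / total;
    }

    public static void main(String[] args) {
        System.out.println(similarText("World", "Word")); // 4
        System.out.println(similarTextPercent("World", "Word"));
        // The result depends on argument order:
        System.out.println(similarText("test", "wert")); // 1
        System.out.println(similarText("wert", "test")); // 2
    }
}
```

Working on offsets into the two char arrays avoids creating a substring on every recursive call, which is close to what the pointer-based C code does.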

Mike
-
OK so I've converted the C similar_text() to Java. I have a love / hate relationship with C lol. Converting slightly hacky pointer code (obviously to make it efficient for PHP) to Java wasn't easy (for me anyway hehe). Unfortunately the code won't fit here... now just point 4) to finish :) – Mike May 15 '10 at 13:38
-1
I think you can take a look at this post: PHP similar_text function in Javascript
That is a JavaScript equivalent of PHP's similar_text; you only need to adapt it to Java. Sorry if that doesn't help, but I think JavaScript and Java syntax differ only a little for this kind of code.
At least you will know the algorithm behind the implementation.

Kevin