Does a library or even standard API call exist that allows me to diff two strings and get the number of diff chars as an int? I wouldn't mind other features, just as long as I can get a more programmatic diff result (such as an int) instead of something that just outputs the entire human readable diff.
-
Hmmm, how would you decide if characters are different? e.g. if two strings are different lengths would the characters of the longer string count as different characters or the same? Is the order of the characters important? – Tarski Apr 15 '10 at 16:01
-
and what about comparing "`steves car`" with "`steve's car`". would that be 1 character different (just the "`'`") or 6 characters different (the whole "`'s car`")? i think there are several different ways of specifying this problem. – Kip Apr 15 '10 at 16:03
-
Do you need the exact number of different characters? compareTo does something similar but in a lexicographic order and returns an int. – Searles Apr 15 '10 at 16:09
-
@tarski: If the strings are longer I want to know about it. Order can be realigned, much like in beyond and compare. @kip: In that example, just 1 char different... so the int returned would be 1. – Zombies Apr 15 '10 at 18:16
2 Answers
I think what you want is the Levenshtein distance - this tells you how many changes (insertions, deletions, or replacements) are required to transform one string into another.
For example, the difference between `abcde` and `abcdef` is 1, because you insert `f` after the last position in `abcde` to get `abcdef`.
The difference between `abcde` and `abcdf` is also 1, since you replace `e` in the first string with `f` to get the second.
The difference between `abcde` and `abde` is 1 because you delete `c` in the first string to get the second.
A very good implementation can be found in Apache Commons Text: LevenshteinDistance.
Here are some sample implementations in Java.
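As a rough illustration of the idea, here is a minimal sketch of the usual dynamic-programming approach to Levenshtein distance, using only the standard library. The class name `LevenshteinSketch` and the method name `distance` are made up for this example and are not taken from any library; the sketch also compares UTF-16 `char` units, so it assumes you don't need special handling for characters outside the Basic Multilingual Plane.

```java
class LevenshteinSketch {

    // Returns the minimum number of single-character insertions,
    // deletions, and substitutions needed to turn a into b.
    // Compares UTF-16 char units, not Unicode code points (an assumption
    // of this sketch).
    static int distance(String a, String b) {
        // previous[j]: distance between the first (i - 1) chars of a
        // and the first j chars of b. current[j]: the row being filled.
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b
        // takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string
            // takes i deletions.
            current[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitutionCost =
                        a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + substitutionCost;
                current[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two arrays: the row just filled becomes "previous".
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("abcde", "abcdef"));          // insert f
        System.out.println(distance("abcde", "abcdf"));           // replace e with f
        System.out.println(distance("abcde", "abde"));            // delete c
        System.out.println(distance("steves car", "steve's car")); // insert '
    }
}
```

Each of the four calls in `main` prints 1, matching the examples above and the "`steves car`" case from the comments. Keeping only two rows of the table holds memory at O(length of `b`) rather than storing the full grid.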
I don't know of any standard API calls, but you could see this question for references to third-party libraries (not surprising - Google, Apache Commons ...)
How to perform string Diffs in Java?
-
Ah, I see: `StringUtils.difference(str1, str2).length()` and `StringUtils.difference(str2, str1).length()` should work just fine. Thanks. – Zombies Apr 15 '10 at 17:11