57

I need to compute diffs between Java strings. I would like to be able to rebuild a string from the original string and a series of diffs. Has anyone done this in Java? What library do you use?

String a1; // This can be a long text
String a2; // e.g. the above text with spelling corrections
String a3; // e.g. the above text with spelling corrections and an additional sentence

// Diff is a hypothetical class; this is the kind of API I am looking for
String differences_a1_a2 = Diff.getDifferences(a1, a2);
String differences_a2_a3 = Diff.getDifferences(a2, a3);
String[] diffs = new String[]{a1, differences_a1_a2, differences_a2_a3};
String new_a3 = Diff.build(diffs);
a3.equals(new_a3); // this is true
kc2001
Sergio del Amo

9 Answers

54

This library seems to do the trick: google-diff-match-patch. It can create a patch string from the differences and lets you reapply the patch.

edit: Another solution might be java-diff-utils: https://code.google.com/p/java-diff-utils/

user7610
bernardn
27

Apache Commons Lang has a string difference method:

org.apache.commons.lang.StringUtils

StringUtils.difference("foobar", "foo");
Paul Whelan
  • 6
    It returns the remainder of the second String, starting from where it differs from the first, which is not efficient enough for me since I would be working with big texts. See: StringUtils.difference("ab", "abxyz") -> "xyz"; StringUtils.difference("ab", "xyzab") -> "xyzab" – Sergio del Amo Sep 25 '08 at 10:33
  • 2
    Also beware this gotcha: `StringUtils.difference("abc", "") = ""` `StringUtils.difference("abc", "abc") = ""` – Alec Jul 04 '16 at 17:57
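The behaviour described in these comments can be reproduced with plain JDK code. What follows is a minimal sketch, not the Commons Lang source: the class name `DifferenceSketch` and its `difference` method are made up for illustration, and non-null inputs are assumed.

```java
class DifferenceSketch {

    // Returns the remainder of `second`, starting at the first index
    // where it differs from `first` (empty if no such index exists in `second`).
    static String difference(String first, String second) {
        int limit = Math.min(first.length(), second.length());
        int i = 0;
        while (i < limit && first.charAt(i) == second.charAt(i)) {
            i++;
        }
        return second.substring(i);
    }

    public static void main(String[] args) {
        System.out.println(difference("ab", "abxyz"));           // xyz
        System.out.println(difference("ab", "xyzab"));           // xyzab
        System.out.println(difference("abc", "").isEmpty());     // true
        System.out.println(difference("abc", "abc").isEmpty());  // true
    }
}
```

Because only a common prefix is skipped, a change near the start of a long text makes the "difference" almost as long as the text itself, and the result alone is not enough to tell "identical" apart from "second string is empty".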
4

The java-diff-utils library might be useful.

dnaumenko
  • 3
    The repo https://github.com/bkromhout/java-diff-utils/ forked indirectly from the original GitHub repository and is better maintained. Maybe you can join forces there? – koppor Nov 24 '16 at 08:19
3

As Torsten says, you can use the Levenshtein distance:

import org.apache.commons.lang.StringUtils;

System.err.println(StringUtils.getLevenshteinDistance("foobar", "bar"));
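For readers without Commons Lang on the classpath, the same number can be computed with the JDK alone. This is a minimal sketch of the textbook dynamic-programming algorithm, not the library's implementation; `LevenshteinSketch` and `distance` are hypothetical names.

```java
class LevenshteinSketch {

    // Edit distance via dynamic programming, keeping only two rows of the matrix.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so `prev` holds the row just computed
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("foobar", "bar")); // 3 (delete "foo")
    }
}
```

Note that the distance is a single number: it says how far apart two strings are, not which characters changed, so on its own it cannot be used to rebuild one string from the other.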
Paul Whelan
1

If you need to deal with differences between big amounts of data and have the differences efficiently compressed, you could try a Java implementation of xdelta, which in turn implements RFC 3284 (VCDIFF) for binary diffs (should work with strings too).

Alexander
0

Use the Levenshtein distance and extract the edit log from the matrix the algorithm builds up. The Wikipedia article links to a couple of implementations; I'm sure there's a Java implementation among them.

Levenshtein distance is closely related to the Longest Common Subsequence problem, so you might also want to have a look at that.
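The idea of reading the edit log back out of the matrix can be sketched as follows, using only the JDK. Everything here is an assumption made for illustration: the class `EditScriptSketch`, the methods `diff` and `apply`, and the one-string-per-operation encoding ("=" keep, "-" delete, "+c" insert c, "~c" replace with c). The `main` method mirrors the a1/a2/a3 scenario from the question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class EditScriptSketch {

    // Builds the full Levenshtein matrix, then walks back from the bottom-right
    // corner to recover one minimal edit script that turns `from` into `to`.
    static List<String> diff(String from, String to) {
        int n = from.length(), m = to.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i;
        for (int j = 0; j <= m; j++) d[0][j] = j;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = from.charAt(i - 1) == to.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
            }
        }
        Deque<String> ops = new ArrayDeque<>(); // push() prepends, so ops end up in forward order
        int i = n, j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && from.charAt(i - 1) == to.charAt(j - 1) && d[i][j] == d[i - 1][j - 1]) {
                ops.push("=");                     // characters match: keep
                i--; j--;
            } else if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + 1) {
                ops.push("~" + to.charAt(j - 1));  // substitution
                i--; j--;
            } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                ops.push("-");                     // deletion from `from`
                i--;
            } else {
                ops.push("+" + to.charAt(j - 1));  // insertion from `to`
                j--;
            }
        }
        return new ArrayList<>(ops);
    }

    // Replays an edit script against the original string to rebuild the target.
    static String apply(String from, List<String> ops) {
        StringBuilder out = new StringBuilder();
        int pos = 0; // read position in `from`
        for (String op : ops) {
            switch (op.charAt(0)) {
                case '=': out.append(from.charAt(pos++)); break;
                case '-': pos++; break;
                case '~': out.append(op.charAt(1)); pos++; break;
                case '+': out.append(op.charAt(1)); break;
                default: throw new IllegalArgumentException("Unknown op: " + op);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String a1 = "Teh quick brown fox";
        String a2 = "The quick brown fox";
        String a3 = "The quick brown fox jumps.";
        List<String> d12 = diff(a1, a2);
        List<String> d23 = diff(a2, a3);
        String rebuilt = apply(apply(a1, d12), d23);
        System.out.println(rebuilt.equals(a3)); // true
    }
}
```

The full matrix needs memory proportional to the product of the two lengths, so this approach is only practical for short strings; for long texts one of the dedicated diff libraries mentioned in the other answers is likely a better fit.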

Torsten Marek
0

Apache Commons Text now has StringsComparator:

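import org.apache.commons.text.diff.CommandVisitor;
import org.apache.commons.text.diff.StringsComparator;

// s1 and s2 are the two strings to compare
StringsComparator c = new StringsComparator(s1, s2);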
c.getScript().visit(new CommandVisitor<Character>() {

    @Override
    public void visitKeepCommand(Character object) {
        System.out.println("k: " + object);
    }

    @Override
    public void visitInsertCommand(Character object) {
        System.out.println("i: " + object);
    }

    @Override
    public void visitDeleteCommand(Character object) {
        System.out.println("d: " + object);
    }
});
Ahmed Ashour
0

I found it useful to discover (for a regression test, where I didn't need diffing support in production) that AssertJ provides built-in access to java-diff-utils. See its DiffUtils, InputStream, or Diff classes, for example.

Joshua Goldberg
-7
public class Stringdiff {

    public static void main(String[] args) {
        System.out.println(strcheck("sum", "sumsum")); // prints "sum"
    }

    // Returns the tail of the longer string, starting at the first index
    // where the two strings differ; "Empty" if they are identical.
    public static String strcheck(String str1, String str2) {
        // The original guard, Math.abs(...) == -1, could never be true;
        // a null check is used here instead.
        if (str1 == null || str2 == null) {
            return "Invalid";
        }
        int num = diffcheck1(str1, str2);
        if (num == -1) {
            return "Empty";
        }
        return str1.length() > str2.length() ? str1.substring(num) : str2.substring(num);
    }

    // Returns the first index at which the strings differ, or -1 if they are equal.
    public static int diffcheck1(String str1, String str2) {
        int limit = Math.min(str1.length(), str2.length());
        for (int i = 0; i < limit; i++) {
            if (str1.charAt(i) != str2.charAt(i)) {
                return i;
            }
        }
        // One string is a prefix of the other: they differ where the shorter one ends.
        return str1.length() == str2.length() ? -1 : limit;
    }
}
  • 7
    Untested plain text code like this almost never makes sense. Create a project on a FLOSS code hosting page and provide the code + tests there. – Kalle Richter Jun 21 '17 at 18:00