21

Are there any libraries out there for Java that will accept two strings, and return a string with formatted output as per the *nix diff command?

e.g. feed in

test 1,2,3,4
test 5,6,7,8
test 9,10,11,12
test 13,14,15,16

and

test 1,2,3,4
test 5,6,7,8
test 9,10,11,12,13
test 13,14,15,16

as input, and it would give you

test 1,2,3,4                                                    test 1,2,3,4
test 5,6,7,8                                                    test 5,6,7,8
test 9,10,11,12                                               | test 9,10,11,12,13
test 13,14,15,16                                                test 13,14,15,16

Exactly the same as if I had passed the files to diff -y expected actual

I found this question, and it gives some good advice on general libraries for giving you programmatic output, but I'm wanting the straight string results.

I could call diff directly as a system call, but this particular app will be running on unix and windows and I can't be sure that the environment will actually have diff available.

Community
  • 1
  • 1
madlep
  • 47,370
  • 7
  • 42
  • 53

5 Answers5

15

java-diff-utils

The DiffUtils library for computing diffs, applying patches, generationg side-by-side view in Java

Diff Utils library is an OpenSource library for performing the comparison operations between texts: computing diffs, applying patches, generating unified diffs or parsing them, generating diff output for easy future displaying (like side-by-side view) and so on.

Main reason to build this library was the lack of easy-to-use libraries with all the usual stuff you need while working with diff files. Originally it was inspired by JRCS library and it's nice design of diff module.

Main Features

  • computing the difference between two texts.
  • capable to hand more than plain ascci. Arrays or List of any type that implements hashCode() and equals() correctly can be subject to differencing using this library
  • patch and unpatch the text with the given patch
  • parsing the unified diff format
  • producing human-readable differences
Mads Hansen
  • 63,927
  • 12
  • 112
  • 147
6

I ended up rolling my own. Not sure if it's the best implementation, and it's ugly as hell, but it passes against test input.

It uses java-diff to do the heavy diff lifting (any apache commons StrBuilder and StringUtils instead of stock Java StringBuilder)

public static String diffSideBySide(String fromStr, String toStr){
    // this is equivalent of running unix diff -y command
    // not pretty, but it works. Feel free to refactor against unit test.
    String[] fromLines = fromStr.split("\n");
    String[] toLines = toStr.split("\n");
    List<Difference> diffs = (new Diff(fromLines, toLines)).diff();

    int padding = 3;
    int maxStrWidth = Math.max(maxLength(fromLines), maxLength(toLines)) + padding;

    StrBuilder diffOut = new StrBuilder();
    diffOut.setNewLineText("\n");
    int fromLineNum = 0;
    int toLineNum = 0;
    for(Difference diff : diffs) {
        int delStart = diff.getDeletedStart();
        int delEnd = diff.getDeletedEnd();
        int addStart = diff.getAddedStart();
        int addEnd = diff.getAddedEnd();

        boolean isAdd = (delEnd == Difference.NONE && addEnd != Difference.NONE);
        boolean isDel = (addEnd == Difference.NONE && delEnd != Difference.NONE);
        boolean isMod = (delEnd != Difference.NONE && addEnd != Difference.NONE);

        //write out unchanged lines between diffs
        while(true) {
            String left = "";
            String right = "";
            if (fromLineNum < (delStart)){
                left = fromLines[fromLineNum];
                fromLineNum++;
            }
            if (toLineNum < (addStart)) {
                right = toLines[toLineNum];
                toLineNum++;
            }
            diffOut.append(StringUtils.rightPad(left, maxStrWidth));
            diffOut.append("  "); // no operator to display
            diffOut.appendln(right);

            if( (fromLineNum == (delStart)) && (toLineNum == (addStart))) {
                break;
            }
        }

        if (isDel) {
            //write out a deletion
            for(int i=delStart; i <= delEnd; i++) {
                diffOut.append(StringUtils.rightPad(fromLines[i], maxStrWidth));
                diffOut.appendln("<");
            }
            fromLineNum = delEnd + 1;
        } else if (isAdd) {
            //write out an addition
            for(int i=addStart; i <= addEnd; i++) {
                diffOut.append(StringUtils.rightPad("", maxStrWidth));
                diffOut.append("> ");
                diffOut.appendln(toLines[i]);
            }
            toLineNum = addEnd + 1; 
        } else if (isMod) {
            // write out a modification
            while(true){
                String left = "";
                String right = "";
                if (fromLineNum <= (delEnd)){
                    left = fromLines[fromLineNum];
                    fromLineNum++;
                }
                if (toLineNum <= (addEnd)) {
                    right = toLines[toLineNum];
                    toLineNum++;
                }
                diffOut.append(StringUtils.rightPad(left, maxStrWidth));
                diffOut.append("| ");
                diffOut.appendln(right);

                if( (fromLineNum > (delEnd)) && (toLineNum > (addEnd))) {
                    break;
                }
            }
        }

    }

    //we've finished displaying the diffs, now we just need to run out all the remaining unchanged lines
    while(true) {
        String left = "";
        String right = "";
        if (fromLineNum < (fromLines.length)){
            left = fromLines[fromLineNum];
            fromLineNum++;
        }
        if (toLineNum < (toLines.length)) {
            right = toLines[toLineNum];
            toLineNum++;
        }
        diffOut.append(StringUtils.rightPad(left, maxStrWidth));
        diffOut.append("  "); // no operator to display
        diffOut.appendln(right);

        if( (fromLineNum == (fromLines.length)) && (toLineNum == (toLines.length))) {
            break;
        }
    }

    return diffOut.toString();
}

private static int maxLength(String[] fromLines) {
    int maxLength = 0;

    for (int i = 0; i < fromLines.length; i++) {
        if (fromLines[i].length() > maxLength) {
            maxLength = fromLines[i].length();
        }
    }
    return maxLength;
}
madlep
  • 47,370
  • 7
  • 42
  • 53
0

Busybox has a diff implementation that is very lean, should not be hard to convert to java, but you would have to add the two-column functionality.

Sparr
  • 7,489
  • 31
  • 48
0

http://c2.com/cgi/wiki?DiffAlgorithm I found this on Google and it gives some good background and links. If you care about the algorithm beyond just doing the project, a book on basic algorithm that covers Dynamic Programming or a book just on it. Algorithm knowledge is always good:)

Jim Keener
  • 9,255
  • 4
  • 24
  • 24
  • Getting working implementations in Java is the easy part - they're out there. The hard bit was actually formatting the output in a format that is required. – madlep Nov 26 '08 at 05:45
0

You can use Apache Commons Text library to achieve this. This library provides 'diff' capability based on "very efficient algorithm from Eugene W. Myers".

This provides you ability to create your own visitor so that you can process the diff in the way you want & may be output to console or HTML etc. Here is one article which walks through nice & simple example to output side by side diff in HTML format using Apache Commons Text library & simple Java code.

Ravi K
  • 976
  • 7
  • 9