36

I have the same HTML file rendered in two different ways and want to compare the two versions using git diff, ignoring every space, tab, line break, carriage return, or anything else that is not strictly part of the source code of my files.

I'm currently trying this:

git diff --no-index --color --ignore-all-space <file1> <file2>

but when some HTML tags are collapsed onto one line (instead of one per line and indented), git diff detects it as a difference (while for me it is not).

<html><head><title>TITLE</title><meta ......

is different from

<html>
    <head>
        <title>TITLE</title>
        <meta ......

Which option am I missing to make git diff treat these two as the same?

Kamafeather

4 Answers

35

git diff can compare files line by line or word by word, and it lets you define what counts as a word. If you define every non-whitespace character as a word, the comparison ignores all whitespace, including spaces, tabs, line breaks and carriage returns, which is what you need.

To do this, use the --word-diff-regex option and set it to [^[:space:]]. Quote the pattern so that the shell does not try to expand the brackets as a glob. Refer to the documentation for details.

git diff --no-index --word-diff-regex='[^[:space:]]' <file1> <file2>

Here's an example. I created two files, with a.html as follows:

<html><head><title>TITLE</title><meta>

With b.html as follows:

<html>
    <head>
        <title>TI==TLE</title>
        <meta>

Running

git diff --no-index --word-diff-regex='[^[:space:]]' a.html b.html

highlights the difference between TITLE and TI{+==+}TLE in plain mode, as shown below. You can also pass --word-diff=<mode> to display the result differently. The mode can be color, plain, porcelain or none; plain is the default.

diff --git a/d.html b/a.html
index df38a78..306ed3e 100644
--- a/d.html
+++ b/a.html
@@ -1 +1,4 @@
<html>
    <head>
            <title>TI{+==+}TLE</title>
                    <meta>
Landys
14

Running git diff --help lists several whitespace-related options, such as:

--ignore-cr-at-eol
    Ignore carriage-return at the end of line when doing a comparison.

--ignore-space-at-eol
    Ignore changes in whitespace at EOL.

-b, --ignore-space-change
    Ignore changes in amount of whitespace. This ignores whitespace at line end, and considers all other sequences of one or more whitespace
    characters to be equivalent.

-w, --ignore-all-space
    Ignore whitespace when comparing lines. This ignores differences even if one line has whitespace where the other line has none.

--ignore-blank-lines
    Ignore changes whose lines are all blank.

You can combine these according to your needs. The command below worked for me:

git diff --ignore-blank-lines --ignore-all-space --ignore-cr-at-eol
Naresh Joshi
6

This does the trick for me:

git diff --ignore-blank-lines
Big McLargeHuge
0

git diff compares files line by line.

It compares the first line of file1 with the first line of file2; since they are not the same, it reports a difference.

Ignoring whitespace means that foo bar will match foobar if both are on the same line. Since your content spans multiple lines in one file and a single line in the other, the files will always differ.
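A minimal sketch of that behaviour, using plain diff -w as a stand-in for the line-by-line comparison described above. The scratch file names are hypothetical and exist only for this demonstration:

```shell
#!/usr/bin/env bash
# Scratch files with hypothetical names, created only for this demonstration.
tmp=$(mktemp -d)
printf 'foo bar\n'  > "$tmp/spaced.txt"
printf 'foobar\n'   > "$tmp/joined.txt"
printf 'foo\nbar\n' > "$tmp/split.txt"

# Whitespace differs within a single line: -w treats the files as equal.
if diff -w "$tmp/spaced.txt" "$tmp/joined.txt" > /dev/null; then
    same_line_result="equal"
else
    same_line_result="different"
fi

# Same characters, but split across two lines: still reported as different.
if diff -w "$tmp/spaced.txt" "$tmp/split.txt" > /dev/null; then
    split_line_result="equal"
else
    split_line_result="different"
fi

echo "same line: $same_line_result"
echo "split lines: $split_line_result"
rm -rf "$tmp"
```

The second comparison fails because "foo bar" is matched against "foo" alone, and no amount of whitespace skipping inside a line makes those two lines equal.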

If you really want to check that the files contain the exact same non-whitespace characters, you could try something like this:

diff <(perl -ne 's/\s+//g; print' file1) <(perl -ne 's/\s+//g; print' file2)
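If a yes/no answer is enough, the same idea can be sketched without perl. This assumes a POSIX tr is available, and same_ignoring_whitespace is a hypothetical helper name, not an existing command:

```shell
#!/usr/bin/env bash
# same_ignoring_whitespace FILE1 FILE2
# Hypothetical helper: succeeds (exit status 0) when both files contain the
# same sequence of non-whitespace characters.
same_ignoring_whitespace() {
    # tr -d '[:space:]' deletes spaces, tabs, newlines and carriage returns.
    [ "$(tr -d '[:space:]' < "$1")" = "$(tr -d '[:space:]' < "$2")" ]
}
```

For example, `same_ignoring_whitespace file1 file2 && echo same || echo different`. Like the perl version, this only says whether the files differ, not where; for locating the change, a word-level diff is more useful.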

Hope it solves your problem!

Ken Williams
Mudassir Razvi
  • It returns an error: `Substitution replacement not terminated at -e line 1`. It also collapses my whole file, when I only wanted to remove the spaces/tabs. It does not solve my problem, and it uses `diff` instead of `git-diff` (which has options like *--ignore-all-space* and *--color*). But thank you for the clarification; you are right: it compares lines, not files or words. – Kamafeather Sep 23 '14 at 13:04
  • The `'not terminated'` error was a missing slash - fixed. – Ken Williams Sep 12 '16 at 20:25