
The point of removing trailing whitespace is that if everyone always does it, diffs stay minimal: they consist only of code changes, not whitespace changes.

However, when working with people who do not practice this, stripping all trailing whitespace with your editor or a pre-commit hook produces an even worse diff, which is the opposite of what you intended.

So I am asking whether there is a tool I can run manually before committing that unstages lines whose only changes are whitespace.

A bonus would be to strip trailing whitespace from staged lines that do contain code changes.

Another bonus would be to leave Markdown files alone, since trailing spaces are meaningful in Markdown.

I am asking here as I fully intend to write this tool if it doesn't already exist.

Jon Seigel
mxcl
  • Interesting question. Sounds like a useful tool if it doesn't already exist. Not a terribly tricky parsing problem, but do let us know what you end up doing. – Paul McMillan Nov 17 '09 at 20:47
  • Just FYI, bumping your git version may make whitespace issues much more manageable. It worked for me. See my question here: http://stackoverflow.com/questions/1316364/git-whitespace-woes *smiles* – Kzqai Nov 17 '09 at 23:48
  • @Max The behavior described in my answer is in the v1.7.0 release: http://www.kernel.org/pub/software/scm/git/docs/RelNotes-1.7.0.txt – Greg Bacon Feb 13 '10 at 18:08

3 Answers


The following will get you most of the way there:

$ clean=`git diff --cached -b`; \
  git apply --cached <(git diff --cached -R); \
  echo "$clean" | git apply --cached -; \
  clean=

In git releases before 1.7.0, this fails if one or more files contain only whitespace changes. For example:

$ git diff --cached -b
diff --git a/file1 b/file1
index b2bd1a5..3b18e51 100644
diff --git a/file2 b/file2
new file mode 100644
index 0000000..092bfb9
--- /dev/null
+++ b/file2
[...]

The empty delta (of file1 above, which really ought to be suppressed) makes git-apply unhappy:

fatal: patch with only garbage at line 3

UPDATE: The 1.7.0 release of git fixes this issue.

Say our repository is in the following state:

$ git diff --cached
diff --git a/foo b/foo
index 3b18e51..a75018e 100644
--- a/foo
+++ b/foo
@@ -1 +1,2 @@
-hello world
+hello  world
+howdy also

We could then run the above commands to fork the index and work tree:

$ git diff --cached
diff --git a/foo b/foo
index 3b18e51..1715a9b 100644
--- a/foo
+++ b/foo
@@ -1 +1,2 @@
 hello world
+howdy also

$ git diff
diff --git a/foo b/foo
index 1715a9b..a75018e 100644
--- a/foo
+++ b/foo
@@ -1,2 +1,2 @@
-hello world
+hello  world
 howdy also

If all changes are whitespace-only, you'll see:

error: No changes

I suspect fixing the index and leaving undesired changes in the work tree would surprise or even irritate most users, but that's the behavior the question asked for.
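
None of the commands above address the question's second bonus, leaving Markdown files alone. One possible approach is to list the staged files with Markdown filtered out and pass that list to the diff commands as pathspecs. The following is a rough sketch, not something from the original answer: skip_markdown and staged_non_markdown are hypothetical names, and treating only .md and .markdown (in any letter case) as Markdown is an assumption.

```shell
#!/bin/sh
# Hypothetical filter: reads one path per line on stdin and prints only
# the paths that do not look like Markdown files.
# Assumption: Markdown files end in .md or .markdown, case-insensitively.
skip_markdown() {
    grep -Eiv '\.(md|markdown)$'
}

# Hypothetical helper: prints the staged paths that are not Markdown.
# git prints these paths relative to the repository root.
staged_non_markdown() {
    git diff --cached --name-only | skip_markdown
}
```

In bash, `mapfile -t files < <(staged_non_markdown)` collects the paths into an array, and appending `-- "${files[@]}"` to each `git diff --cached` call, when run from the repository root, should restrict the rewrite to those files. Skip the commands when the array is empty, because git treats an empty pathspec list as all files. Paths that git quotes in its output because of unusual characters are not handled by this sketch.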

Greg Bacon

My solution on git 1.7.2.5 was as follows (starting without any changes staged):

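# save the diff with all whitespace changes ignored
git diff -w > temp.patch
# shelve all local changes, including the whitespace-only ones
git stash
# re-apply the saved diff, tolerating whitespace differences in context lines
git apply --ignore-space-change --ignore-whitespace temp.patch
# tidy up:
rm temp.patch
git stash drop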

This returns your repo to its starting state, minus any whitespace-only changes.

You can then stage your changes as usual.

ErichBSchulz
  • This is especially useful when you've made two changes that should be split by "git add -p" *and* you've made whitespace changes too. – MKaras Aug 16 '14 at 14:45

I had used a script based on @GregBacon's answer for years, but at some point git changed the output of git diff -b: lines whose whitespace changes it suppresses now appear in the patch's context lines in their after state rather than their before state. The patch therefore no longer applies cleanly once the index has been reset.

I found that reversing both the clean diff and the corresponding apply made the script work again.

So my complete script used to look like this:

#!/bin/bash

clean=`git diff --cached -b`
git apply --cached <(git diff --cached -R)
echo "$clean" | git apply --cached -
clean=

And now looks like:

#!/bin/bash

clean=`git diff --cached -b -R`
git apply --cached <(git diff --cached -R)
echo "$clean" | git apply --cached -R -
clean=
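
The revised script can also be guarded against the two situations where git apply receives nothing useful: nothing staged at all, and staged changes that are whitespace-only (the "No changes" error mentioned in the earlier answer). This is a minimal sketch under stated assumptions, not a tested replacement: unstage_whitespace_only and patch_has_hunks are hypothetical names, and a patch is treated as empty when it has no hunk header, which would also skip hunkless changes such as pure mode changes.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the patch text on stdin contains at
# least one hunk header (a line starting with "@@ ").
patch_has_hunks() {
    grep -q '^@@ '
}

# Hypothetical wrapper around the revised script. The body runs in a
# subshell so that "clean" does not leak into the caller's environment.
unstage_whitespace_only() (
    # Nothing staged: leave the index alone.
    git diff --cached --quiet && exit 0
    # Reverse diff of the staged changes, ignoring changes in the amount
    # of whitespace.
    clean=$(git diff --cached -b -R) || exit
    # Reset the index to HEAD by applying the full staged diff in reverse.
    git diff --cached -R | git apply --cached - || exit
    # Re-stage the non-whitespace changes, unless none are left.
    if printf '%s\n' "$clean" | patch_has_hunks; then
        printf '%s\n' "$clean" | git apply --cached -R -
    fi
)
```

When only whitespace was staged, the index simply ends up matching HEAD and the function returns without an error message; the work tree is left untouched in every case.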

I also have a script that leaves only the whitespace changes staged. This is useful if, while fixing something, you also make whitespace fixes elsewhere in the file and want to commit those first in a separate commit. It is just:

git apply --cached -R <(git diff --cached -w)
rjmunro