
I've made a single simple change to a large number of files that are version controlled in git and I'd like to be able to check that no other changes are slipping into this large commit.

The changes are all of the form

-                       "main()",
+                       OOMPH_CURRENT_FUNCTION,

where "main()" could be the name of any function. I want to generate a diff of all changes that are not of this form.

The -G and -S options to git diff are tantalisingly close--they find changes that DO match a string or regexp.

Is there a good way to do this?

Attempts so far

Another question describes how regexes can be negated. Using that approach, I think the command should be

git diff -G '^((?!OOMPH_CURRENT_FUNCTION).)*$'

but this just returns the error message

fatal: invalid log-grep regex: Invalid preceding regular expression

so I guess git doesn't support this regex feature.

I also noticed that the standard unix diff has the -I option to "ignore changes whose lines all match RE". But I can't find the correct way to replace git's own diff with the unix diff tool.

dshepherd
  • If you can find all the changes that do match, store them in a file and `git diff | fgrep -vxf file` – tripleee Apr 08 '13 at 12:13
  • Maybe you could also store the result of the git diff in a file, and use a better regex tool. – Loamhoof Apr 08 '13 at 12:15
  • @tripleee This worked, thanks! It's not an ideal solution but if you rewrite it as an answer I'll accept it. – dshepherd Apr 08 '13 at 12:39
  • @Loamhoof I thought of that, but I think you would need to properly parse diff output to be able to remove entire changes (rather than just removing single lines). – dshepherd Apr 08 '13 at 12:41
  • If this is a recurring problem, it would be worth the effort to set up a `.gitattribute` filter driver for the changes (http://stackoverflow.com/a/12969603/520162). Doing such, the files won't even be shown as modified if `main()` is replaced by `OOMPH_CURRENT_FUNCTION` – eckes Apr 08 '13 at 12:54
  • @tripleee Actually, having thought about the problem some more I don't think your solution is "safe". In the example above if another change happened to include the line '- "main()",' it would be hidden in the diff (while the rest of the change would still be there). This could be extremely confusing! – dshepherd Apr 08 '13 at 20:11
  • This is totally worth a feature request... – naught101 Jul 21 '17 at 01:56

4 Answers


No more grep needed!

With Git 2.30 (Q1 2021), the "git diff" family of commands learned the "-I<regex>" option, which ignores hunks whose changed lines all match the given pattern.

See commit 296d4a9, commit ec7967c (20 Oct 2020) by Michał Kępień (kempniu).
(Merged by Junio C Hamano -- gitster -- in commit 1ae0949, 02 Nov 2020)

diff: add -I<regex> that ignores matching changes

Signed-off-by: Michał Kępień

Add a new diff option that enables ignoring changes whose all lines (changed, removed, and added) match a given regular expression.
This is similar to the -I/--ignore-matching-lines option in standalone diff utilities and can be used e.g. to ignore changes which only affect code comments or to look for unrelated changes in commits containing a large number of automatically applied modifications (e.g. a tree-wide string replacement).

The difference between -G/-S and the new -I option is that the latter filters output on a per-change basis.

Use the 'ignore' field of xdchange_t for marking a change as ignored or not.
Since the same field is used by --ignore-blank-lines, identical hunk emitting rules apply for --ignore-blank-lines and -I.
These two options can also be used together in the same git invocation (they are complementary to each other).

Rename xdl_mark_ignorable() to xdl_mark_ignorable_lines(), to indicate that it is logically a "sibling" of xdl_mark_ignorable_regex() rather than its "parent".

diff-options now includes in its man page:

-I<regex>

--ignore-matching-lines=<regex>

Ignore changes whose all lines match <regex>.
This option may be specified more than once.

Examples:

git diff --ignore-blank-lines -I"ten.*e" -I"^[124-9]"
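Applied to the question's case, this could look roughly like the sketch below. It builds a throwaway repository (the file names a.cc and b.cc and their contents are made up for illustration) and assumes Git 2.30 or later. Two -I patterns are given because every removed and added line of a change has to match at least one of them: one pattern for the new OOMPH_CURRENT_FUNCTION line, and one for the old quoted function name.

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical files a.cc and b.cc.
# Requires Git >= 2.30 for the -I option.
repo=$(mktemp -d) || exit 1
cd "$repo" || exit 1
git init -q .

# a.cc will receive only the mechanical replacement,
# b.cc will receive an unrelated edit.
printf 'throw Error("msg",\n            "main()",\n            __LINE__);\n' > a.cc
printf 'int answer = 41;\n' > b.cc
git add a.cc b.cc
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m initial

printf 'throw Error("msg",\n            OOMPH_CURRENT_FUNCTION,\n            __LINE__);\n' > a.cc
printf 'int answer = 42;\n' > b.cc

# A change is ignored only if each of its removed and added lines
# matches at least one of the -I patterns.
filtered=$(git diff --no-color --no-ext-diff \
    -I'OOMPH_CURRENT_FUNCTION' \
    -I'^[[:space:]]*"[^"]*",$')
printf '%s\n' "$filtered"
```

The remaining output should contain the unrelated change to b.cc but not the mechanical replacement in a.cc. The second pattern is deliberately loose (any line consisting of a single quoted string followed by a comma), so it may need tightening for a real code base.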

A small memleak in "diff -I<regexp>" has been corrected with Git 2.31 (Q1 2021).

See commit c45dc9c, commit e900d49 (11 Feb 2021) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 45df6c4, 22 Feb 2021)

diff: plug memory leak from regcomp() on {log,diff} -I

Signed-off-by: Ævar Arnfjörð Bjarmason

Fix a memory leak in 296d4a9 ("diff: add -I that ignores matching changes", 2020-10-20, Git v2.30.0-rc0 -- merge listed in batch #3) by freeing the memory it allocates in the newly introduced diff_free().

This memory leak was intentionally introduced in 296d4a9, see the discussion on a previous iteration of it.

At that time freeing the memory was somewhat tedious, but since it isn't anymore with the newly introduced diff_free() let's use it.

Let's retain the pattern for diff_free_file() and add a diff_free_ignore_regex(), even though (unlike "diff_free_file") we don't need to call it elsewhere.
I think this will make for more readable code than gradually accumulating a giant diff_free() function, sharing "int i" across unrelated code etc.

VonC
  • The "**all** lines match" requirement makes this somewhat fiddly. You can _minimize_ the chance that "all lines" run into other changes by tweaking the number of context lines (e.g. `--unified=1` as mentioned [by Beni](https://stackoverflow.com/a/51232525/241211)), but you can't throw out uninteresting changes if they border interesting ones. – Michael Jan 25 '21 at 16:46
  • @Michael Good point. I still think this is an interesting addition to git diff, however fiddly it is. – VonC Jan 25 '21 at 16:51
  • Of interest: https://github.com/git/git/commit/54c8a7c379fc37a847b8a5ec5c419eae171322e1 from https://github.com/git/git/commit/2da81d1efb0166e1cec7a8582b837994dde6225b – VonC Jun 08 '22 at 06:51

Try the following:

$ git diff > full_diff.txt
$ git diff -G "your pattern" > matching_diff.txt

You can then compare the two like so:

$ diff matching_diff.txt full_diff.txt

If all changes match the pattern, full_diff.txt and matching_diff.txt will be identical, and the last diff command will not return anything.

If there are changes that do not match the pattern, the last diff will highlight those.
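As a rough illustration of the steps above, here is a sketch that runs them in a throwaway repository. The file names a.cc and b.cc and their contents are invented for the example; only the git diff, git diff -G and diff invocations come from the answer itself.

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical files a.cc and b.cc.
repo=$(mktemp -d) || exit 1
cd "$repo" || exit 1
git init -q .

printf 'report("msg",\n       "main()",\n       __LINE__);\n' > a.cc
printf 'int answer = 41;\n' > b.cc
git add a.cc b.cc
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m initial

# a.cc gets the expected replacement, b.cc gets an unrelated change.
printf 'report("msg",\n       OOMPH_CURRENT_FUNCTION,\n       __LINE__);\n' > a.cc
printf 'int answer = 42;\n' > b.cc

git diff --no-color --no-ext-diff > full_diff.txt
git diff --no-color --no-ext-diff -G 'OOMPH_CURRENT_FUNCTION' > matching_diff.txt

# Lines present only in the full diff are reported with a leading "> ".
unmatched=$(diff matching_diff.txt full_diff.txt)
printf '%s\n' "$unmatched"
```

Here the comparison reports only the diff of b.cc, because a.cc appears identically in both files. Note that the selection is per file: an unrelated edit inside a.cc would appear in both diffs and so would not be reported.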


You can combine all of the above steps and avoid having to create two extra files like so:

diff <(git diff -G "your pattern") <(git diff)  # works with other diff tools too
Maxim Belkin
Elmar Peise
  • This works, and I can't see any obvious problems (apart from strange formatting of the output) so I'm accepting it. Thanks! I still think there should be a better way to do this though. – dshepherd Apr 10 '13 at 09:48
  • Smart. This is exactly the kind of solution I like given my toolset (vim). It would be even better if `git-diff` grew a negative `-G` option to mimic how GNU diff's `-x` works... :) – sehe Jul 01 '16 at 21:04
  • `-G` selects whole files, not just the changed sections. This means that any other changes in files that also contain your unwanted change will be ignored. – Xorax Feb 03 '20 at 10:37

Use git difftool to run a real diff.

Example: https://github.com/cben/kubernetes-discovery-samples/commit/b1e946434e73d8d1650c887f7d49b46dcbd835a6
I've created a script running diff the way I want to (here I'm keeping curl --verbose outputs in the repo, resulting in boring changes each time I rerun the curl):

#!/bin/bash
diff --recursive --unified=1 --color \
     --ignore-matching-lines=serverAddress \
     --ignore-matching-lines='^\*  subject:' \
     --ignore-matching-lines='^\*  start date:' \
     --ignore-matching-lines='^\*  expire date:' \
     --ignore-matching-lines='^\*  issuer:' \
     --ignore-matching-lines='^< Date:' \
     --ignore-matching-lines='^< Content-Length:' \
     --ignore-matching-lines='--:--:--' \
     --ignore-matching-lines='{ \[[0-9]* bytes data\]' \
     "$@"

And now I can run git difftool --dir-diff --extcmd=path/to/above/script.sh and see only interesting changes.

An important caveat about GNU diff -I (aka --ignore-matching-lines): it merely prevents such lines from making a chunk "interesting"; when these changes appear in the same chunk as other, non-ignored changes, diff will still show them. I used --unified=1 above to reduce this effect by making chunks smaller (only 1 context line above and below each change).
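A minimal sketch of that caveat, using made-up files (old1, new1, old2, new2) and assuming GNU diff is the diff on the PATH:

```shell
#!/bin/sh
# Sketch only: hypothetical files illustrating the -I caveat with GNU diff.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1

# Case 1: the only change is a "Date:" line, so the whole chunk is ignorable.
printf 'Date: Mon\nbody\n' > old1
printf 'Date: Tue\nbody\n' > new1
alone=$(diff --unified=1 --ignore-matching-lines='^Date:' old1 new1)

# Case 2: the "Date:" change sits right next to an interesting change,
# so both end up in the same chunk and both are printed.
printf 'Date: Mon\nstatus: ok\n' > old2
printf 'Date: Tue\nstatus: failed\n' > new2
mixed=$(diff --unified=1 --ignore-matching-lines='^Date:' old2 new2)

printf 'case 1 output: [%s]\n' "$alone"
printf 'case 2 output:\n%s\n' "$mixed"
```

In the first case nothing is printed for the pair of files; in the second, the Date: lines are shown along with the status: lines, even though they match the ignore pattern.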

Beni Cherniavsky-Paskin

I think that I have a different solution using pipes and grep. I had two files that needed to be checked for differences that didn't include @@ and g:, so I did this (borrowing from here and here and here):

$ git diff -U0 --color-words --no-index file1.tex file2.tex | grep -v -e "@@" -e "g:"

and that seemed to do the trick. Colors still were there.

So I assume you could take a simpler git diff command/output and do the same thing. What I like about this is that it doesn't require making new files or redirection (other than a pipe).

kcrisman