Playing around with the standard Linux diff command, I could not find a way to avoid the following type of grouping in its output (the output listings here assume the unified format).

This question aims at the case that each line differs by little from its counterpart in the other file, and it's more useful to see each line next to its counterpart.

Instead of having groups like this show up in the comparison output:

    - line 1
    - line 2
    - line 3
    + line 1 modified
    + line 2 modified
    + line 3 modified

I would like to get this:

    - line 1
    + line 1 modified
    - line 2
    + line 2 modified
    - line 3
    + line 3 modified

Of course, this is a convenience question, as this can be accomplished by writing your own code to post-process the diff output, or by replacing the LCS algorithm with your own. I don't think variants like wdiff would help much: the plain diff -U0 output format fits my needs very well except for this grouping property, whereas wdiff introduces other aspects that are not optimal for my case.

I'm looking for a command-line way, or a library that can be used in code, not a UI tool.

matanster
  • If you did something as simple as putting a blank line between each real line (just for the purpose of diff), it would give you the output you seek. That could be done in numerous tools, including sed. Unfortunately, I don't know of a way to make diff read from anything other than a file... then again you could just put them in /var/tmp and then run diff – Hambone Aug 05 '14 at 20:15
  • I guess I can pipe that together. Maybe it will mess up the diff if the files are very different. Let's see what else comes up. Thanks. – matanster Aug 05 '14 at 21:19
  • duplicate of [git diff with interleaved lines](https://stackoverflow.com/questions/22134471/git-diff-with-interleaved-lines) and [How to prevent GNU diff to group lines for patches?](https://stackoverflow.com/questions/62023493/how-to-prevent-gnu-diff-to-group-lines-for-patches) – milahu Mar 29 '22 at 15:31
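
The blank-line idea from the first comment can be sketched roughly as follows. This is only a minimal illustration, not a tested tool: the function name `interleaved_diff` and the temporary file names are made up here, and it assumes a `diff` that accepts `-U0` (GNU and BSD diff do) plus `mktemp`. Each input line is followed by an empty line so that diff can anchor on the shared empty lines, and the headers are stripped afterwards.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): pad each line of both files
# with an empty line, diff the padded copies, and strip the diff headers.
interleaved_diff() (
    tmp=$(mktemp -d) || exit 2
    # "sed G" appends the (empty) hold space, i.e. a blank line, after every line.
    sed G "$1" > "$tmp/a"
    sed G "$2" > "$tmp/b"
    # -U0 prints no context, so the shared blank lines are not shown.
    # Drop the two "---"/"+++" file header lines and every "@@" hunk header.
    diff -U0 "$tmp/a" "$tmp/b" | sed -e '1,2d' -e '/^@@/d'
    rm -rf "$tmp"
)
```

It would be called as `interleaved_diff file1 file2`. As noted in the comments, the pairing is likely to drift when the files differ by more than small per-line edits, or when they already contain empty lines of their own.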

1 Answer

I was trying to solve this myself. The closest I got was this:

diff -y -W 10000 file1 file2 | grep '|' | sed 's/\s*|\s*/\n/g'

The one issue is that this assumes there are no whitespace differences at the beginning of the lines (or that you don't care about them).
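
A different route that leaves leading whitespace untouched is to post-process plain `diff -U0` output instead of splitting side-by-side columns. The following is a rough sketch under a few assumptions, not a definitive implementation: the function name `pair_hunk_lines` is invented for illustration, and it simply pairs the i-th removed line of each change group with the i-th added line, which is only meaningful when the lines really correspond one to one.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): interleave the "-" and "+"
# lines of each change group in unified diff output.
pair_hunk_lines() {
    diff -U0 "$1" "$2" | awk '
    # Print buffered lines as -/+ pairs; leftovers are printed unpaired.
    function flush(   i, n) {
        n = (nd > na) ? nd : na
        for (i = 1; i <= n; i++) {
            if (i <= nd) print del[i]
            if (i <= na) print add[i]
        }
        nd = na = 0
    }
    NR <= 2 { next }                                  # skip the ---/+++ file headers
    /^@@/   { flush(); next }                         # a new hunk starts: emit the previous one
    /^-/    { if (na) flush(); del[++nd] = $0; next } # buffer a removed line
    /^\+/   { add[++na] = $0; next }                  # buffer an added line
            { flush(); print }                        # anything else is passed through
    END     { flush() }
    '
}
```

Called as `pair_hunk_lines file1 file2`, this keeps each line exactly as diff printed it, including any leading whitespace. A "\ No newline at end of file" marker is passed through as-is and would split the pairing at that point.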

Sebastien Diot
  • improved version: `W=1000; c=$(((W+1)/2)); diff -y -t -W $W file1 file2 | while read -r L; do a="$(echo "$L" | cut -c1-$((c-1)) | sed -E 's/ +$//')"; b="$(echo "$L" | cut -c$((c+3))-)"; echo "-$a"; echo "+$b"; done`. see also [here](https://stackoverflow.com/a/71665866/10440128) – milahu Mar 29 '22 at 16:38
  • If you have a better answer, why not give it as separate answer? – Sebastien Diot Mar 30 '22 at 09:01
  • done in [the more popular thread](https://stackoverflow.com/questions/22134471/git-diff-with-interleaved-lines/71665866#71665866) – milahu Mar 30 '22 at 09:03