How to get the difference (only additions) between two files in Linux

Question

I have two files A1 and A2 (unsorted). A1 is a previous version of A2, and some lines have been added to A2. How can I get the new lines that were added to A2?

Note: I want only the newly added lines, not the lines that were in A1 but deleted in A2. When I run diff A1 A2, I get the additions as well as the deletions, but I want only the additions.

Please suggest a way to do this.

Are all the added lines in A2 new to the file? I mean, are there no duplicates of existing lines? – Kent Mar 13 '13 at 12:10

scottkosty · Answer 1 · 2021-12-08T22:18:58.293

122

Most of the below is copied directly from @TomOnTime's serverfault answer here. At the bottom is an attempt that works on unsorted files, but the command sorts the files before giving the diff so in many cases it will not be what is desired. For well-formatted diffs of unsorted files, you might find the other answers more useful (thanks to @Fritz for pointing this out):

Show lines that only exist in file a: (i.e. what was deleted from a)

comm -23 a b

Show lines that only exist in file b: (i.e. what was added to b)

comm -13 a b

Show lines that only exist in one file or the other: (but not both)

comm -3 a b | sed 's/^\t//'

(Warning: If file a has lines that start with a TAB, that first TAB will be removed from the output.)

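NOTE: Both files need to be sorted for "comm" to work properly. If they aren't already sorted, you should sort them first. The example below uses -12, which prints the lines common to both files; substitute -13 to print only the lines that were added to b: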

sort <a >a.sorted
sort <b >b.sorted
comm -12 a.sorted b.sorted

If the files are extremely long, this may be quite a burden as it requires an extra copy and therefore twice as much disk space.

Edit: note that the command can be written more concisely using process substitution (thanks to @phk for the comment):

comm -12 <(sort < a) <(sort < b)
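To make the "additions only" case concrete, here is a minimal sketch that applies the sort-then-comm approach with -13 to two small unsorted files. The file names and contents are made up for illustration, and it uses temporary sorted copies rather than process substitution so that it also runs in a plain POSIX shell.

```shell
# Hypothetical sample data; names and contents are only for illustration.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$workdir/a"
printf '%s\n' banana date cherry elderberry > "$workdir/b"

# comm needs sorted input, so sort each file into a temporary copy first.
sort "$workdir/a" > "$workdir/a.sorted"
sort "$workdir/b" > "$workdir/b.sorted"

# -1 suppresses lines unique to a, -3 suppresses lines common to both,
# leaving only the lines unique to b (the additions), in sorted order.
added=$(comm -13 "$workdir/a.sorted" "$workdir/b.sorted")
printf '%s\n' "$added"

rm -r "$workdir"
```

Note that the result comes out in sorted order, not in the order the lines appear in b.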

edited Dec 08 '21 at 22:18

answered Feb 26 '16 at 05:11

scottkosty

Since we are talking about `bash` here the last command can be simplified to `comm -12 <(sort < a) <(sort < b)` using process substitution. – phk Feb 24 '17 at 12:13
You sir, are my hero. – Henri-Maxime Ducoulombier Feb 04 '19 at 16:49
Why does an answer that works only on sorted files have so many upvotes, despite the question clearly stating that the two files are *unsorted*? – Fritz Nov 27 '21 at 11:51
This answer contains the command (see the last one in particular) that works if the files are not sorted. I think it makes sense to give the commands that assume sorted input first since they are easier to understand, and to build up to the commands that do not assume sorted input. – scottkosty Dec 06 '21 at 19:47
@scottkosty: Thanks for your response. Sorry, I didn't communicate my point clearly in my previous comment. My issue with this approach (in contrast to the `diff`-based answers) is that the order of the lines is jumbled up by sort, and in _many_ cases, order matters. Imagine comparing two source code files to see which function was added. Only, you _sort_ the source code alphabetically before... Sorting code does not make sense. This is why I think the other answers are better. I think you should at least mention this caveat somewhere in your answer. – Fritz Dec 08 '21 at 17:06
@Fritz Ah, that makes sense. I indeed did not even think about that. I agree, and edited in the caveat at the top. Thanks for clarifying. – scottkosty Dec 08 '21 at 22:19

score 80 · Accepted Answer · answered Mar 13 '13 at 12:07

diff and then grep for the edit type you want.

diff -u A1 A2 | grep -E "^\+"
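Note that this pipeline also prints the `+++` header line of the unified diff and keeps the leading `+` on every line. A possible refinement, offered as a sketch, drops the two header lines and strips the marker with a single sed call. The sample files here are hypothetical.

```shell
# Hypothetical sample data; names and contents are only for illustration.
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/A1"
printf '%s\n' alpha gamma ++plus delta > "$workdir/A2"

# 1,2d drops the two header lines (--- and +++) of the unified diff.
# Every remaining line that starts with + is printed with that one + removed,
# so added lines whose own text begins with + are kept intact.
added=$(diff -u "$workdir/A1" "$workdir/A2" | sed -n '1,2d; /^+/ s/^+//p')
printf '%s\n' "$added"

rm -r "$workdir"
```

Deleting by position rather than by pattern avoids throwing away an added line that happens to start with `++`.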

timrau

This will leave you with `+` at the beginning of the line – kgadek Aug 08 '16 at 13:18
You can remove those with sed: `diff -u A1 A2 | grep '^\+' | sed -E 's/^\+//'` – Javier Parra Nov 01 '18 at 23:07
@AmauryD your edit gets rid of the first `+++ A2` line, but leave a `+` sign at the beginning of every line, which is what the comment and sed command above are about. – remram Jun 10 '20 at 21:54
You can combine the `grep` and `sed` in one command: `diff -u A1 A2 | sed -n '/^+[^+]/ s/^+//p'` – remram Jun 10 '20 at 21:55
Drawback: this solution leaves lines which reference line numbers, e.g. `@@ -31,6 +630,8 @@` – JellicleCat Jul 06 '20 at 18:56

score 70 · Answer 3 · edited Jun 17 '19 at 13:52

You can try this:

diff --changed-group-format='%>' --unchanged-group-format='' A1 A2

The options are documented in man diff:

       --GTYPE-group-format=GFMT
              format GTYPE input groups with GFMT

and:

       LTYPE is 'old', 'new', or 'unchanged'.  GTYPE is LTYPE or 'changed'.

and:

       GFMT (only) may contain:

       %<     lines from FILE1

       %>     lines from FILE2

       [...]
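As a rough illustration of what these group formats produce, the sketch below runs the command on two hypothetical files in which one line is modified and one line is appended. It assumes GNU diff, since the group-format options are a GNU extension.

```shell
# Hypothetical sample data; names and contents are only for illustration.
workdir=$(mktemp -d)
printf '%s\n' one two three > "$workdir/A1"
printf '%s\n' one TWO three four > "$workdir/A2"

# GNU diff only: print the FILE2 side of every differing group and
# nothing for unchanged groups.
# diff exits with status 1 when the files differ, so that status is accepted.
added=$(diff --changed-group-format='%>' --unchanged-group-format='' \
    "$workdir/A1" "$workdir/A2") || [ $? -eq 1 ]
printf '%s\n' "$added"

rm -r "$workdir"
```

Both the modified line and the appended line are printed, because a modified line is reported as the FILE2 side of a changed group.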

Ciro Santilli OurBigBook.com

answered Mar 13 '13 at 12:16

Premjith

Can you please explain these options? I couldn't work them out from the man page. – user1004985 Mar 13 '13 at 12:29
See this link for more on GNU [line group formats](http://www.gnu.org/software/diffutils/manual/html_node/Line-Group-Formats.html) – Premjith Mar 13 '13 at 12:39
The `''` after `--unchanged-group-format=''` looks like a single `"`, which won't work. Maybe change the `''` to `""` lest someone types your answer in with a single `"`. – lolololol ol Mar 29 '18 at 15:24
This is a much better answer than the selected answer, btw. Gives you exactly what you want, rather than the output littered with `+` symbols and an unnecessary meta line. – lolololol ol Mar 29 '18 at 15:25
For me, this also displays lines that were changed, not just completely new lines. – Florian Brucker Apr 16 '18 at 08:53
This is the best answer. – Annahri May 29 '20 at 15:22

Francesc Rosas · Answer 4 · 2021-02-01T10:43:15.113

13

A similar approach to https://stackoverflow.com/a/15385080/337172 but hopefully more understandable and easy to tweak:

diff \
  --new-line-format="%L" \
  --old-line-format="" \
  --unchanged-line-format="" \
  A1 A2
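As one example of how the line formats can be tweaked, the sketch below prefixes each added line with its line number in A2 by using the `%dn` line-number specifier. It assumes GNU diff, and the sample files are hypothetical.

```shell
# Hypothetical sample data; names and contents are only for illustration.
workdir=$(mktemp -d)
printf '%s\n' one two three > "$workdir/A1"
printf '%s\n' one TWO three four > "$workdir/A2"

# GNU diff only: %dn is the input line number, %L is the line with its newline.
# Old and unchanged lines get an empty format, so only lines from A2 that
# differ are printed. Exit status 1 means "files differ" and is accepted.
added=$(diff \
    --new-line-format='%dn: %L' \
    --old-line-format='' \
    --unchanged-line-format='' \
    "$workdir/A1" "$workdir/A2") || [ $? -eq 1 ]
printf '%s\n' "$added"

rm -r "$workdir"
```

Unlike the sort-based approaches, this keeps the added lines in the order in which they appear in A2.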

answered Aug 06 '18 at 16:52

Francesc Rosas

score 7 · Answer 5 · edited Jun 18 '19 at 17:09

A simple method is to use:

sdiff A1 A2

Another method is to use comm, as described in the question "Comparing two unsorted lists in linux, listing the unique in the second file".

HoldOffHunger

answered Mar 13 '13 at 12:09

Mihai8

score 7 · Answer 6 · answered Mar 13 '13 at 12:11

You can type:

grep -v -f A1 A2

Zabador

Assume file `A1` contains one line, `x`, and file `A2` contains two lines, `x` and `xx`. This command outputs nothing, since both lines in `A2` contain `x`. – timrau Mar 13 '13 at 12:16
`grep`'s `-x` (`--line-regexp`) can be used to ensure the entire line is matched. So if A1 contains `x` and A2 contains `xx`, a match will not be found. – Rusty Lemur Oct 01 '18 at 16:44
You probably also need to use the option `-F` or `--fixed-strings`. Otherwise `grep` will be interpreting `A1` as regular expressions. So if `A1` contains the line `.*`, it will match everything. So the entire command would be: `grep -vxF -f A1 A2` – wisbucky Aug 28 '19 at 20:55
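Putting the suggestions from these comments together, here is a small sketch of `grep -vxF -f` on hypothetical files chosen to exercise both pitfalls: a line that is a substring of another line, and a line that would match everything if it were read as a regular expression.

```shell
# Hypothetical sample data; names and contents are only for illustration.
workdir=$(mktemp -d)
printf '%s\n' 'x' '.*' > "$workdir/A1"
printf '%s\n' 'xx' 'x' 'anything' '.*' > "$workdir/A2"

# -F: treat each line of A1 as a fixed string, not a regular expression
# -x: a pattern only matches if it equals the whole line
# -v: print the lines of A2 that match none of the patterns
added=$(grep -vxF -f "$workdir/A1" "$workdir/A2")
printf '%s\n' "$added"

rm -r "$workdir"
```

The output keeps the order of A2. Because the files are compared as sets of lines, a line added to A2 that duplicates a line already present in A1 is not reported.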

score 7 · Answer 7 · answered Nov 21 '16 at 10:57

git diff path/file.css | grep -E "^\+" | grep -v '+++ b/' | cut -c 2-

grep -E "^\+" is from the accepted answer above; on its own it is incomplete because it also leaves lines that are not source content
grep -v '+++ b/' removes the header line that names the later version of the file
cut -c 2- removes the leading column of + signs; sed 's/^+//' also works

Neither comm nor sdiff was an option because of git.

user1046885

Best answer! This returns exactly the lines that have been added and nothing more. This should be the accepted answer, I think. – Bancarel Valentin Jun 16 '17 at 13:24

score 0 · Answer 8 · edited Jul 18 '23 at 09:55

You can filter for only the lines that must be added to fileA to make it equal to fileB. The -B 1 option also prints the line before each match, which for a pure addition is the diff line giving the line numbers where the change applies.

diff "$fileA" "$fileB" | grep -B 1 '^>'

Marco Pagliaricci

answered Jul 14 '23 at 03:06

Alfonso Baqueiro Bernal
