How to find which line is missing in another file

Question

On a Linux box I have one file, A.txt, and a second file, B.txt.

I want to know which lines are in A.txt but not in B.txt; for my sample files it should print the value 4.

Not sure why this was considered "unclear" - `comm` is the command you're looking for. In this specific case `comm -23 A.txt B.txt`. — twalberg, Jan 17 '14 at 16:52
It seems that "unclear" is being used for lack of a "no effort" closer. — showdev, Jan 17 '14 at 17:45
This is a great question - it described my exact problem in a way that I was able to find on google easily, and it has answers I can use. — rjmunro, Sep 03 '15 at 09:51

score 21 · Accepted Answer · answered Jan 17 '14 at 15:11

awk 'NR==FNR{a[$0]=1;next}!a[$0]' B.txt A.txt

I haven't tested this, but give it a try.

Kent

[Explanation of why it works](https://stackoverflow.com/a/32488079/213816) – nonsleepr Oct 17 '17 at 14:54
That would populate `a[]` with the superset of values from both files and so use more memory than necessary. It's better to do `awk 'NR==FNR{a[$0];next} !($0 in a)' B A` instead so it only has to hold the contents of `B` in memory. – Ed Morton Aug 26 '21 at 11:54
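The two-pass idea from the answer and the comment above can be sketched as follows. The sample data is hypothetical (the original file contents are not shown in this thread), and the array name `seen` and the variable names are illustrative only.

```shell
# Hypothetical sample data: unsorted on purpose, and 4 appears only in A.txt.
workdir=$(mktemp -d)
printf '%s\n' 3 1 4 2 5 > "$workdir/A.txt"
printf '%s\n' 5 2 1 3 > "$workdir/B.txt"

# Pass 1 (NR==FNR holds only while the first file, B.txt, is read):
#   store each line as an array key, then skip to the next line.
# Pass 2 (A.txt): print a line only if it is not a key in the array.
#   Testing with "in" does not create new array entries.
missing=$(awk 'NR==FNR { seen[$0]; next } !($0 in seen)' \
    "$workdir/B.txt" "$workdir/A.txt")

echo "$missing"
rm -rf "$workdir"
```

One caveat: if B.txt is empty, `NR==FNR` also holds while A.txt is read, so nothing is printed.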

score 21 · Answer 2 · answered Jan 17 '14 at 19:53

Use comm if the files are sorted as your sample input shows:

$ comm -23 A.txt B.txt
4

If the files are unsorted, see @Kent's awk solution.
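Another option for unsorted input, not part of the original answer, is to sort copies of both files first and then run comm on the sorted copies. This is a minimal sketch with hypothetical sample data and file names; note that the result comes out in sorted order rather than in the original order of A.txt.

```shell
# Hypothetical sample data: unsorted, and 4 appears only in A.txt.
workdir=$(mktemp -d)
printf '%s\n' 3 1 4 2 5 > "$workdir/A.txt"
printf '%s\n' 5 2 1 3 > "$workdir/B.txt"

# comm expects both inputs sorted in the same collation order.
sort "$workdir/A.txt" > "$workdir/A.sorted"
sort "$workdir/B.txt" > "$workdir/B.sorted"

# -2 hides lines found only in the second file, -3 hides lines common to both,
# so only lines unique to the first file remain.
only_in_a=$(comm -23 "$workdir/A.sorted" "$workdir/B.sorted")

echo "$only_in_a"
rm -rf "$workdir"
```

In bash, `comm -23 <(sort A.txt) <(sort B.txt)` does the same without temporary files.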

Ed Morton

I think this should be marked as the actual answer! – Wang Jun 28 '14 at 22:47

score 8 · Answer 3 · answered Sep 03 '15 at 09:59

You can also do this using grep by combining the -v (show non-matching lines), -x (match whole lines) and -f (read patterns from file) options:

$ grep -v -x -f B.txt A.txt
4

This does not depend on the order of the lines in the files: it removes any line from A.txt that matches a line in B.txt.

rjmunro

Fantastic! Works like a charm, doesn't even require lines to be sorted. Thanks. – Vijay Varadan Sep 26 '15 at 05:18

score 4 · Answer 4 · answered Dec 24 '16 at 02:47

(An addition to @rjmunro's answer)

The proper way to use grep for this is:

$ grep -F -v -x -f B.txt A.txt
4

Without the -F flag, grep interprets each PATTERN read from B.txt as a basic regular expression (BRE), which is undesired here and can cause trouble. The -F flag makes grep treat the patterns as newline-separated fixed strings. For instance:

$ cat A.txt
&
^
[
]

$ cat B.txt
[
^
]
|

$ grep -v -x -f B.txt A.txt
grep: B.txt:1: Invalid regular expression

$ grep -F -v -x -f B.txt A.txt
&
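Besides outright errors, a missing -F can also silently drop lines. A small sketch with hypothetical file contents: the line `a.c` in B.txt, read as a regular expression, also matches `abc` in A.txt.

```shell
# Hypothetical sample data: "a.c" is a valid regex in which "." matches any character.
workdir=$(mktemp -d)
printf '%s\n' 'abc' 'a.c' '4' > "$workdir/A.txt"
printf '%s\n' 'a.c' > "$workdir/B.txt"

# Regex matching: "a.c" matches both "abc" and "a.c", so "abc" is wrongly removed.
as_regex=$(grep -v -x -f "$workdir/B.txt" "$workdir/A.txt")

# Fixed-string matching: only the literal line "a.c" is removed.
as_fixed=$(grep -F -v -x -f "$workdir/B.txt" "$workdir/A.txt")

echo "without -F:"; echo "$as_regex"
echo "with -F:";    echo "$as_fixed"
rm -rf "$workdir"
```

Without -F only `4` survives; with -F both `abc` and `4` are reported as missing from B.txt.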

This is obviously the best answer. It can be written slightly more compact as `grep -Fvx -f B.txt A.txt` — Serge Stroobandt, Oct 03 '20 at 20:44

score 3 · Answer 5 · answered Jan 17 '14 at 15:04

Using diff:

diff --changed-group-format='%<' --unchanged-group-format='' A.txt B.txt

edited Jan 17 '14 at 15:12

answered Jan 17 '14 at 15:04

gipsh

I'm not sure `diff` will work for this problem. Suppose A contains 1-10 and B contains 9-1 and 100-200, all unsorted. The output should be 10. – Kent Jan 17 '14 at 15:16

In the question the files are sorted. It won't work on unsorted files, but you can use sort or any other shell command to sort the files first. – gipsh Jan 17 '14 at 15:44
