
I compare two files with this command:

comm -13 file1 file2

It works perfectly and shows me the differences. However, I would also like it to show the line numbers in file2 of the lines that are unique to the second file.

file1:

a
d
e
f
g

file2:

a
b
c
d
e

I run:

comm -13 file1 file2

Output:

b
c

But I need the line numbers at which b and c appear in file2. Desired output:

2
3
thanasisp
defekas17

2 Answers

Using awk:

$ awk 'NR==FNR{a[$0];next}!($0 in a){print FNR}' file1  file2

Output:

2
3
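
For readers new to this idiom, here is a commented sketch of the same logic. The array name `seen`, the `result` variable and the scratch directory are my own choices for illustration; the files are recreated from the samples in the question. Note that `NR==FNR` only identifies the first file if that file is not empty.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a d e f g > "$tmpdir/file1"
printf '%s\n' a b c d e > "$tmpdir/file2"

# Same logic as the one-liner above, spread out with comments.
result=$(awk '
NR == FNR {          # holds while reading the first file (if it is non-empty)
    seen[$0]         # remember each line of file1 as an array key
    next             # do not apply the second rule to file1 lines
}
!($0 in seen) {      # a file2 line that never occurred in file1
    print FNR        # FNR is the line number within the current file
}' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample files this prints 2 and 3. Unlike comm, this approach does not need sorted input, but it also ignores how many times a line occurs.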

Edit: The command above behaves differently from the comm in the OP when file2 has duplicates. The solution below should fix that (see comments, and thanks @EdMorton):

$ awk '
NR==FNR {
    a[$0]++
    next
}
{
    if(!($0 in a)||a[$0]<=0)
        print FNR
    else a[$0]--
}' file1 file2

Output now (file2 has a duplicate entry d where FNR==5):

2
3
5

Hopefully there aren't that many more pitfalls waiting...

James Brown
    That's not quite the same. Add a 2nd `d` line to file `a` then try `comm -13 b a` and your awk command and you'll find the former correctly outputs `d` while the latter doesn't output the line number that'd be associated with `d` because it doesn't take duplicates into account. – Ed Morton Sep 20 '20 at 12:48
0
awk 'NR==FNR{a[$0]++; next} (--a[$0]) < 0{print FNR}' file1 file2

For example, using a modified file2 that includes an extra d line to show that duplicate values are handled correctly:

$ cat file2
a
b
c
d
d
e

$ comm -13 file1 file2
b
c
d

$ awk 'NR==FNR{a[$0]++; next} (--a[$0]) < 0' file1 file2
b
c
d

$ awk 'NR==FNR{a[$0]++; next} (--a[$0]) < 0{print FNR}' file1 file2
2
3
5
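
As a rough illustration of why the counter handles duplicates, the one-liner can be expanded as below. The names `count`, `numbers`, `lines` and `expected` and the scratch directory are assumptions made for this sketch only; the last step cross-checks the awk output against comm using the modified file2 shown above.

```shell
# Recreate file1 from the question and the modified file2 (extra d line).
tmpdir=$(mktemp -d)
printf '%s\n' a d e f g   > "$tmpdir/file1"
printf '%s\n' a b c d d e > "$tmpdir/file2"

# One awk program for both variants; the "show" variable selects
# whether the line number or the line itself is printed.
prog='
NR == FNR {              # first file: count how often each line occurs
    count[$0]++
    next
}
(--count[$0]) < 0 {      # second file: use up one occurrence per line;
                         # below zero means file2 has a surplus copy
    print (show == "num" ? FNR : $0)
}'

numbers=$(awk -v show=num  "$prog" "$tmpdir/file1" "$tmpdir/file2")
lines=$(awk   -v show=line "$prog" "$tmpdir/file1" "$tmpdir/file2")
expected=$(comm -13 "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$numbers"
[ "$lines" = "$expected" ] && echo "awk output matches comm -13"
rm -rf "$tmpdir"
```

The first d in file2 is cancelled by the single d in file1, so only the second d (line 5) is reported, which matches what comm -13 does on sorted input.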
Ed Morton