25

I want to print the lines that are in one file but not in another. However, neither file is sorted, and I need to retain the original order in both files.

 contents of file1:
 string2
 string1
 string3

 contents of file2:
 string3
 string1

 Output:
 string2

Is there a simple script that can accomplish this?

unwind
j.lee

4 Answers

51
fgrep -x -f file2 -v file1

-x match whole line

-f FILE takes patterns from FILE

-v inverts results (show non-matching)
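As a minimal sketch of how these options combine, the snippet below recreates the question's sample files in a scratch directory and runs the same filter. It uses `grep -F`, the equivalent spelling of `fgrep`; the `workdir` and `result` names are assumptions introduced only for this illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' string2 string1 string3 > "$workdir/file1"
printf '%s\n' string3 string1 > "$workdir/file2"

# -F: fixed strings, -x: whole-line match,
# -v: invert the match, -f: read patterns from file2.
result=$(grep -F -x -v -f "$workdir/file2" "$workdir/file1")
printf '%s\n' "$result"

rm -r "$workdir"
```

Because grep walks file1 from top to bottom, the surviving lines come out in file1's original order.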

Charles Brunet
  • @ysth : Yes, fgrep means (file)grep, hence the -f option. This goes back to older versions of unix. I think the gnu grep makes such a distinction redundant. ;-) Incidentally, FILE can contain many lines of patterns to match. – shellter Apr 28 '11 at 21:45
  • @shellter: no, fgrep means "fixed" grep, I think; fixed strings instead of regular expressions, also invokable as `grep -F`. I was suggesting it should be fgrep instead of grep, and indeed it has since been changed. – ysth Apr 28 '11 at 22:27
  • @ysth : Well in the book I read X years ago, they said (file)grep. Fixed(grep) per fixed strings also does a good job of describing the functionality. Thanks for sharing. Cordially! – shellter Apr 28 '11 at 23:20
  • @j.lee If this answer satisfies you, please consider accepting it. – Charles Brunet Jun 14 '11 at 17:35
6

In Perl, load file2 into a hash, then read through file1, outputting only lines that weren't in file2:

use strict;
use warnings;

my %file2;
open my $file2, '<', 'file2' or die "Couldn't open file2: $!";
while ( my $line = <$file2> ) {
    ++$file2{$line};
}

open my $file1, '<', 'file1' or die "Couldn't open file1: $!";
while ( my $line = <$file1> ) {
    print $line unless $file2{$line};
}
ysth
  • Leave the filenames as arguments and call the script something like `except` so you can say things like `except file2 file1 > result`. – reinierpost Apr 29 '11 at 08:50
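One way to sketch the "filenames as arguments" idea from the comment above is a small shell function. This is an illustration, not the Perl script itself: it swaps in awk for the filtering step, and the `except` name and argument order simply follow the comment's `except file2 file1` suggestion.

```shell
# Hypothetical helper: print the lines of INPUT_FILE that do not appear
# in EXCLUDE_FILE, keeping INPUT_FILE's original order.
# Usage: except EXCLUDE_FILE INPUT_FILE
except() {
    if [ "$#" -ne 2 ]; then
        echo "usage: except EXCLUDE_FILE INPUT_FILE" >&2
        return 2
    fi
    # While reading the first file (NR==FNR), remember each line as a key.
    # For the second file, print only the lines that were never stored.
    awk 'NR==FNR { seen[$0]; next } !($0 in seen)' "$1" "$2"
}
```

With this defined, `except file2 file1 > result` writes the filtered lines to `result`. One caveat of the `NR==FNR` idiom: if the first file is empty, awk treats the second file as the first one and prints nothing.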
4
awk 'FNR==NR{a[$0];next} (!($0 in a))' file2 file1
ghostdog74
0

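comm -13 <(sort a) <(sort b) → Lines in file b that are not in file a. With -3 alone, comm prints the lines unique to either file, and because both inputs are sorted first, the output does not keep the original order.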

David Okwii