
I'm still learning command-line tools like grep and diff and how they apply to my project, and I can't wrap my head around how to approach this problem.

I have two files, each containing hundreds of 20-character strings. Let's call the files A and B. I want to search through A and, using the values in B as keys, find the UNIQUE strings that occur in A but not in B (A contains duplicates, so uniqueness is the key here).

Any ideas?

I'm not opposed to finding the answer myself, but I don't understand the different command-line tools and what they do well enough to start thinking about how to combine them.

  • possible duplicate of [Unix command to find lines common in two files](http://stackoverflow.com/questions/373810/unix-command-to-find-lines-common-in-two-files) – Jonathan Leffler Jan 22 '14 at 20:59
  • Verdammelt, thanks for the help! I like your answer because it shows me proper usage of grep and also introduces sort which is something I didn't know I could do. – user3225219 Jan 24 '14 at 01:25

2 Answers


Look up the comm command (specified by POSIX) to do this. See also "Unix command to find lines common in two files".

Jonathan Leffler

There are two ways to do this: with comm, or with grep, sort, and uniq.

comm

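comm -23 afile bfile

comm compares two sorted files and outputs three columns: lines only in afile, lines only in bfile, and lines common to both. The -2 and -3 switches tell comm not to print the second and third columns, which leaves only the lines that occur in afile but not in bfile. Both inputs must already be sorted; running each file through sort -u first also removes the duplicates.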
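As a rough illustration of the comm approach, here is a minimal sketch. The sample strings and the temporary directory are made up for the example (short words stand in for the 20-character strings), and the `.sorted` file names are just a convention chosen here.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana apple > "$workdir/afile"
printf '%s\n' banana date > "$workdir/bfile"

# comm expects sorted input; sort -u also collapses duplicate lines.
sort -u "$workdir/afile" > "$workdir/afile.sorted"
sort -u "$workdir/bfile" > "$workdir/bfile.sorted"

# -2 drops "only in bfile", -3 drops "in both": what remains is "only in afile".
result=$(comm -23 "$workdir/afile.sorted" "$workdir/bfile.sorted")
printf '%s\n' "$result"   # prints apple, then cherry

rm -rf "$workdir"
```

Sorting into separate files keeps the sketch portable to any POSIX shell; in bash the same thing can be written with process substitution instead of intermediate files.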

grep sort uniq

grep -F -v -f bfile afile | sort | uniq

or just

grep -F -v -f bfile afile | sort -u

if your sort handles the -u option.

(Note: the fgrep command, if your system has it, is equivalent to grep -F.)
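The grep pipeline can be tried on a small made-up data set as below. This is only a sketch: the sample strings and the scratch directory are hypothetical, and the -x flag (match whole lines only) is an addition that is not part of the answer. With strings that are all the same length it should not change the result, but it guards against one key matching as a substring of a longer line.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana apple > "$workdir/afile"
printf '%s\n' banana date > "$workdir/bfile"

# -F: treat patterns as fixed strings, not regular expressions
# -x: a pattern must match the entire line (extra safety, see note above)
# -v: print the lines that do NOT match
# -f: read the patterns, one per line, from bfile
result=$(grep -F -x -v -f "$workdir/bfile" "$workdir/afile" | sort -u)
printf '%s\n' "$result"   # prints apple, then cherry

rm -rf "$workdir"
```

Unlike comm, this pipeline does not need the inputs sorted beforehand; sort -u at the end only removes the duplicates from the result.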

verdammelt
  • Or `grep -F` in lieu of `fgrep`. The official POSIX standard for [`grep`](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html) no longer includes `egrep` (`grep -E`) or `fgrep` (`grep -F`) -- but real world implementations still include the original names, possibly as alternative names to the main `grep` binary. – Jonathan Leffler Jan 22 '14 at 22:00