0

I have UTF-8 plain-text lists of usernames, one per line, in list1.txt and list2.txt. Note, in case it is pertinent, that usernames may contain regex metacharacters (e.g. ! ^ . ( and the like) as well as spaces.

I want to get and save to matches.txt a list of all unique values occurring in both lists. I've little command line expertise but this almost gets me there:

grep -Ff list1.txt list2.txt > matches.txt

...but that is treating "jdoe" and "jdoe III" as a match, returning "jdoe III" as the matched value. This is incorrect for the task. I need the per-line pattern match to be the whole line, i.e. from ^ to $. I've tried adding the -x flag but that gets no matches at all (edit: see comment to accepted answer - I got the flag order wrong).

I'm on OS X 10.9.5 and I don't have to use grep - another command line (tool) solving the problem will do.

mwra
  • If the files are sorted then `comm -1 -2 list1.txt list2.txt` might do what you want. – Etan Reisner Feb 12 '15 at 16:18
  • possible duplicate of [finding contents of one file into another file in unix shell script](http://stackoverflow.com/questions/15059422/finding-contents-of-one-file-into-another-file-in-unix-shell-script) – tripleee Feb 12 '15 at 16:49
  • I was hoping to find a better duplicate, one which suggests `grep -Fxf`. This is a FAQ so I'm sure there is one, but I could not find it. – tripleee Feb 12 '15 at 16:51
  • Well, I did spend some time looking for SO answers as I figured this is FAQ but many things were close but answered different questions (different enough to not help someone inexperienced with shell/CL tools). The most useful answer I found was this [How to grep the exact match](http://stackoverflow.com/questions/4709912/how-to-grep-the-exact-match), though that failed for the use case stated above. – mwra Feb 12 '15 at 17:01
  • The link at head to a suggested alternate thread *does not answer this question* (perhaps someone can delete that banner - it doesn't really help someone looking for an answer to *this* question). Likewise, the link given by tripleee doesn't answer this question. – mwra Feb 13 '15 at 09:12
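To illustrate the `comm` suggestion from the first comment above, here is a minimal sketch. The sample usernames and the temporary directory are made up for the demo, and the pre-sorting step is an assumption on my part, since `comm` only works on sorted input. `sort -u` also drops duplicate lines, which covers the "unique values" requirement.

```shell
# Hypothetical sample data (not the real lists from the question)
dir=$(mktemp -d)
printf '%s\n' 'jdoe' 'a.b!c' 'mary (admin)' > "$dir/list1.txt"
printf '%s\n' 'jdoe III' 'mary (admin)' 'jdoe' 'jdoe' > "$dir/list2.txt"

# comm needs sorted input; LC_ALL=C makes sort and comm agree on byte order
LC_ALL=C sort -u "$dir/list1.txt" > "$dir/sorted1.txt"
LC_ALL=C sort -u "$dir/list2.txt" > "$dir/sorted2.txt"

# -1 and -2 suppress lines unique to either file, leaving only common lines
LC_ALL=C comm -12 "$dir/sorted1.txt" "$dir/sorted2.txt" > "$dir/matches.txt"
cat "$dir/matches.txt"
```

With this data, matches.txt should contain only `jdoe` and `mary (admin)`. `comm` compares whole lines, so `jdoe III` is not treated as a match for `jdoe`.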

4 Answers

2

All you need to do is add the -x flag to your grep query:

grep -Fxf list1.txt list2.txt > matches.txt

The -x flag will restrict matches to full line matches (each PATTERN becomes ^PATTERN$). I'm not sure why your attempt at -x failed. Maybe you put it after the -f, which must be immediately followed by the first file?
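A small sketch of the difference, using made-up usernames in a temporary directory (both are assumptions for the demo). The trailing `sort -u` is my addition rather than part of the command above; it is there only because the question asks for unique values, and it changes the output order.

```shell
# Hypothetical sample data
dir=$(mktemp -d)
printf '%s\n' 'jdoe' 'a.b!c' > "$dir/list1.txt"
printf '%s\n' 'jdoe III' 'jdoe' 'axb!c' 'a.b!c' 'jdoe' > "$dir/list2.txt"

# Without -x: a fixed-string pattern may match anywhere in the line,
# so "jdoe III" is reported. "axb!c" is not, because -F turns off regex
# and the "." in "a.b!c" is taken literally.
grep -Ff "$dir/list1.txt" "$dir/list2.txt" > "$dir/loose.txt"

# With -x: a pattern must match the entire line
grep -Fxf "$dir/list1.txt" "$dir/list2.txt" | sort -u > "$dir/matches.txt"
cat "$dir/matches.txt"
```

Here loose.txt includes `jdoe III`, while matches.txt holds only `a.b!c` and `jdoe`, each once.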

Adam Katz
  • Yes, both `-Fxf` and `-xFf` now seem to work. I'm sure I tried the three flags in all sorts of order combos before posting. Oh well. What I'd missed, through inexperience, is that the `f` flag *must* come last so that the next item on the command line is the pattern file. Thanks! – mwra Feb 12 '15 at 19:32
  • Yeah, grep isn't going to complain on e.g. `grep -Ffx list1.txt list2.txt` if you have a file named `x`. – Adam Katz Feb 12 '15 at 19:35
1

awk is handier than grep here:

awk 'FNR==NR{a[$0]; next} $0 in a' list1.txt list2.txt > matches.txt

$0 is the current line, FNR is the line number within the current file, and NR is the overall line number (the two are equal only while reading the first file). a[$0] creates an entry in an associative array (hash) whose key is the line. next ensures that the remaining clause ($0 in a) does not run when the current clause (the check that this is the first file) did. $0 in a is true when the current line is a key in the array a, so only lines present in both files are printed. They appear in the order in which they occur in the second file.
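If the second file can contain the same username more than once, the command above prints it each time. A possible variant, sketched here with made-up sample data, adds a second array (named `seen`, my own choice) so that each match is printed only on its first occurrence.

```shell
# Hypothetical sample data
dir=$(mktemp -d)
printf '%s\n' 'jdoe' 'mary (admin)' > "$dir/list1.txt"
printf '%s\n' 'jdoe III' 'mary (admin)' 'jdoe' 'jdoe' > "$dir/list2.txt"

# First file: store each line as a key in a.
# Second file: print the line if it is a key in a and has not been printed yet
# (seen[$0]++ is 0, i.e. false, only the first time a given line is met).
awk 'FNR==NR { a[$0]; next } ($0 in a) && !seen[$0]++' \
    "$dir/list1.txt" "$dir/list2.txt" > "$dir/matches.txt"
cat "$dir/matches.txt"
```

With this data the result is `mary (admin)` followed by a single `jdoe`, in the order they first appear in list2.txt.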

Adam Katz
anubhava
  • Yes this does work (in my example case) - even if I don't understand how! I'll wait a bit and see if a grep based answer shows up or will come back and accept this. Thanks. – mwra Feb 12 '15 at 17:03
  • @AdamKatz: Many thanks for adding explanation here. I somehow missed OP's comment and forgot to add some explanation here. – anubhava Feb 12 '15 at 18:55
0

A very simple and straightforward way to do it that doesn't require one to do all sorts of crazy things with grep is as follows

cat list1.txt list2.txt | grep match > matches.txt

Not only that, but it's also easier to remember (especially if you regularly use cat).
  • That command line, as stated, doesn't work. With data in `list1.txt` and `list2.txt`, the resulting `matches.txt` is empty. What 'match' does is unclear: is it a flag to grep or another command? My system (OS X 10.9.5) has no entry for 'match' in `man grep`, nor is 'match' recognised as a command. This command line seems to answer a different question from the one posed. – mwra Feb 13 '15 at 09:06
0

grep -Fwf file1 file2 would match whole words rather than substrings.
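Note that word matching is not the same as whole-line matching. A quick sketch with made-up data (the file names under the temporary directory are assumptions for the demo) compares `-w` with `-x` on the case from the question:

```shell
# Hypothetical sample data
dir=$(mktemp -d)
printf '%s\n' 'jdoe' > "$dir/list1.txt"
printf '%s\n' 'jdoe III' 'jdoe' 'jdoes' > "$dir/list2.txt"

# -w: the match must be bounded by non-word characters,
# so "jdoe III" still matches (the space ends the word) but "jdoes" does not
grep -Fwf "$dir/list1.txt" "$dir/list2.txt" > "$dir/word.txt"

# -x: the whole line must equal the pattern
grep -Fxf "$dir/list1.txt" "$dir/list2.txt" > "$dir/line.txt"

cat "$dir/word.txt"
cat "$dir/line.txt"
```

So `-w` rules out `jdoes` but still returns `jdoe III`, which is the false positive the question wants to avoid. Only `-x` limits the result to `jdoe`.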

sane