Let's say I have "File 1" with content as below:

    123|abc|def|
    456|ghi|jkl|
    789|mno|pqr|

And I have "File 2" with content as below:

    123|abc|def|
    456|ghi|jkl|
    789|mno|pqr|
    134|rst|uvw|

As you can see, "134" does not exist in File 1, so my shell script should create a File 3 that contains the following:

    134|rst|uvw|

How can I achieve this?

mfay
  • Stackoverflow is not a code writing service. Please show us what you have tried already. See: [How to create a Minimal, Complete, and Verifiable example.](https://stackoverflow.com/help/mcve). Also you may just be looking for something as simple as `grep -vf file1 file2 > file3` – Jedi Aug 01 '17 at 03:51
  • I just need to know what command is used best, not the whole code – mfay Aug 01 '17 at 03:55
  • Possible duplicate of [Fast way of finding lines in one file that are not in another?](https://stackoverflow.com/questions/18204904/fast-way-of-finding-lines-in-one-file-that-are-not-in-another) – Jedi Aug 01 '17 at 03:56
  • yes, similar to my problem. Thanks – mfay Aug 01 '17 at 03:57
  • `sort file1 file2 | uniq -u >file3`? – Cyrus Aug 01 '17 at 04:14
  • @Cyrus, no that's problematic since it gives uniques from both files (though admittedly OP did not specify if that is a problem) – Jedi Aug 04 '17 at 22:20
  • @Jedi yes, that is correct. The answer below is right and can be used for this problem; I've tried it – mfay Aug 05 '17 at 07:44

1 Answer

In awk:

    $ awk -F\| 'NR==FNR{a[$1];next} !($1 in a)' file1 file2    # > file3
    134|rst|uvw|

Explained:

    $ awk -F\| '     # field separator is |
    NR==FNR {        # process the first file
        a[$1]        # store the value of the first field as a key in hash a
        next         # skip to the next record
    }                # below, processing the second file
    !($1 in a)       # if the first field is not found in hash a, output the record
    ' file1 file2    # > file3 redirects to file3 if desired; otherwise output goes to your screen

Basically, it stores the first-field values of file1 as keys in the hash a and, while processing file2, prints the records whose first field is not found in a. It therefore differs from the grep solutions, which compare the whole record rather than just one field.
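To make the difference between comparing one field and comparing the whole record concrete, here is a small sketch. It is only an illustration under assumptions: the `awk_key_demo` directory, the output file names, and the altered `456|XXX|jkl|` row are made up and not part of the question's data. The whole-record variant is written in awk as a stand-in for the grep approach, not as the grep command itself.

```shell
#!/bin/sh
# Hypothetical demo directory and file names (assumptions, not from the question).
demo=awk_key_demo
mkdir -p "$demo"

# file1 as in the question.
printf '%s\n' '123|abc|def|' '456|ghi|jkl|' '789|mno|pqr|' > "$demo/file1"

# file2 as in the question, except that the 456 record has a changed
# second field (an assumed case, added to show where the two approaches differ).
printf '%s\n' '123|abc|def|' '456|XXX|jkl|' '789|mno|pqr|' '134|rst|uvw|' > "$demo/file2"

# Key-only comparison: a record of file2 is printed only if its first
# field never appeared as a first field in file1.
awk -F'|' 'NR==FNR { seen[$1]; next } !($1 in seen)' \
    "$demo/file1" "$demo/file2" > "$demo/by_key.txt"

# Whole-record comparison: a record of file2 is printed if the exact
# same line never appeared in file1.
awk 'NR==FNR { seen[$0]; next } !($0 in seen)' \
    "$demo/file1" "$demo/file2" > "$demo/by_line.txt"

echo "by key:";  cat "$demo/by_key.txt"
echo "by line:"; cat "$demo/by_line.txt"
```

With this input, the key-only version reports just the `134` record, while the whole-record version also reports the modified `456` record. Which one is right depends on whether a changed row with a known key should count as "missing" from file1.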

James Brown