
I am learning file comparison using awk.

I found syntax like below,

awk '
(NR == FNR) { 
    s[$0]
    next 
} 
{ 
    for (i=1; i<=NF; i++) 
        if ($i in s) 
            delete s[$i] 
} 
END { 
    for (i in s) 
        print i 
}' tests tests2

I couldn't understand the syntax. Can you please explain it in detail?

What exactly does it do?

James Brown
    `NR`, `FNR`, `NF` are *Built-in variables*, if you want to know more about them I suggest reading [8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR](https://www.thegeekstuff.com/2010/01/8-powerful-awk-built-in-variables-fs-ofs-rs-ors-nr-nf-filename-fnr/) – Daweo Feb 22 '22 at 09:00
    One of the most important Unix commands is `man`, for _manual_. Open a terminal, type `man awk`, and you'll find the answers to most of your questions. If tomorrow you start learning `find`, try `man find`. In some cases the manual shown is a short version; the long version is frequently available with the `info` command (e.g. `info sed`). – Renaud Pacalet Feb 22 '22 at 09:15
    This post can help you: https://stackoverflow.com/questions/32481877/what-are-nr-and-fnr-and-what-does-nr-fnr-imply – Carlos Pascual Feb 22 '22 at 12:02
    *I found syntax like below,* where did you find it? Was there no corresponding explanation? – Daweo Feb 22 '22 at 12:34

1 Answer

awk '                      # use awk
(NR == FNR) {              # true only while reading the first file
    s[$0]                  # store the whole record as a key in array s
    next                   # skip the remaining rules and read the next record
} 
{                          # process the second file
    for (i=1; i<=NF; i++)  # for each field in record
        if ($i in s)       # if field value found in the hash
            delete s[$i]   # delete the value from the hash
} 
END {                      # after processing both files 
    for (i in s)           # all leftover values in s
        print i            # are output
}' tests tests2

For example, for files:

tests:
1
2
3

tests2:
1 2
4 5

the program would output:

3
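
To try this yourself, here is a minimal sketch that recreates the two sample files and runs the same program. The scratch directory from `mktemp -d` and the `dir` and `result` variable names are my own assumptions, not part of the original script.

```shell
# Recreate the two sample files in a scratch directory (an assumption for this demo).
dir=$(mktemp -d)
printf '1\n2\n3\n' > "$dir/tests"
printf '1 2\n4 5\n' > "$dir/tests2"

result=$(awk '
NR == FNR { s[$0]; next }      # first file: remember each whole line as a key
{
    for (i = 1; i <= NF; i++)  # second file: look at every field
        if ($i in s)
            delete s[$i]       # the value was seen, so forget it
}
END { for (i in s) print i }   # keys never seen in the second file
' "$dir/tests" "$dir/tests2")

echo "$result"   # prints 3
rm -r "$dir"
```

Note that `for (i in s)` visits keys in an unspecified order, so with more than one leftover value the output order may vary.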
James Brown
  • Thanks a lot, @James. Can you explain lines 3 and 4 a little more? – ironman junior Feb 22 '22 at 11:12
  • `s[$0]` stores a record as a key into an array for a quick lookup when searching values from the second file. `next` breaks the processing of the program for the current record, fetches the next record from the file and starts processing it. The point of `next` here is to stop the program from processing first file records in the program block meant for the records of the second file. – James Brown Feb 22 '22 at 11:31
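
To see why `NR == FNR` selects only the first file, and what `next` skips, it may help to print both counters for every record. This is only an illustrative sketch using the same sample data; the `first:`/`second:` labels and the `dir` and `trace` variables are made up for the demonstration.

```shell
# Same sample files as in the answer, written to a scratch directory (an assumption).
dir=$(mktemp -d)
printf '1\n2\n3\n' > "$dir/tests"
printf '1 2\n4 5\n' > "$dir/tests2"

# NR counts records across all files; FNR restarts at 1 for each file.
# Without "next", first-file records would also fall through to the second rule.
trace=$(awk '
NR == FNR { print "first:  NR=" NR " FNR=" FNR " record=" $0; next }
          { print "second: NR=" NR " FNR=" FNR " record=" $0 }
' "$dir/tests" "$dir/tests2")

echo "$trace"
# first:  NR=1 FNR=1 record=1
# first:  NR=2 FNR=2 record=2
# first:  NR=3 FNR=3 record=3
# second: NR=4 FNR=1 record=1 2
# second: NR=5 FNR=2 record=4 5
rm -r "$dir"
```

The two counters agree only while the first file is being read, which is why the condition acts as a "first file only" test.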