What are NR and FNR and what does "NR==FNR" imply?

Question

I am learning file comparison using awk.

I found the following syntax:

awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1 file2

I don't understand the significance of NR==FNR here. If I use FNR==NR instead, I get the same output.

What exactly does it do?

See `Two-file Processing` on http://backreference.org/2010/02/10/idiomatic-awk/ — Etan Reisner, Sep 09 '15 at 14:17
Thanks for this link. It explains what many other awk fail to explain : the "awk-onic" way of writing awk scripts — Kiteloopdesign, Aug 11 '22 at 18:53

Tom Fenech · Answer 1 · 2023-01-16T09:02:39.487

141

In Awk:

FNR refers to the record number (typically the line number) in the current file.
NR refers to the total number of records read so far, across all input files.
The operator == is a comparison operator, which returns true when the two surrounding operands are equal.

This means that the condition NR==FNR is normally only true for the first file, as FNR resets back to 1 for the first line of each file but NR keeps on increasing.
This pattern is typically used to perform actions on only the first file. It works assuming that the first file is not empty, otherwise the two variables would continue to be equal while Awk was processing the second file.
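As a rough sketch of the empty-first-file case described above: the file names empty.txt and data.txt are made up for illustration, and the FILENAME==ARGV[1] test is one possible alternative (not part of the original one-liner) that assumes the first command-line operand is a plain file name rather than a var=value assignment.

```shell
# Scratch directory so no real files are touched.
tmp=$(mktemp -d)

: > "$tmp/empty.txt"                # first file: zero records
printf 'x\ny\n' > "$tmp/data.txt"   # second file: two records

# NR==FNR: the empty file contributes no records, so NR and FNR stay
# equal all the way through data.txt and its lines are treated as
# belonging to the "first file".
naive=$(awk 'NR==FNR{print "first:", $0; next} {print "second:", $0}' \
    "$tmp/empty.txt" "$tmp/data.txt")

# Alternative sketch: compare the current file name with the first operand.
robust=$(awk 'FILENAME==ARGV[1]{print "first:", $0; next} {print "second:", $0}' \
    "$tmp/empty.txt" "$tmp/data.txt")

printf '%s\n' "--- NR==FNR ---" "$naive" "--- FILENAME==ARGV[1] ---" "$robust"
rm -r "$tmp"
```

With NR==FNR both lines of data.txt are labelled "first:"; with the FILENAME test they are labelled "second:".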

The next inside the block tells Awk to stop processing the current record and move on to the next one, so any rules that follow it are skipped for the first file and only run on files other than the first.

The condition FNR==NR compares the same two operands as NR==FNR, so it behaves in the same way.

edited Jan 16 '23 at 09:02

answered Sep 09 '15 at 14:19

Tom Fenech


"=" is sometimes used to test equality, and sometimes to make an assignment. FNR==NR would be different than NR==FNR if the double equals sign was being used for assignment. So for someone unfamiliar with awk, such as this asker, it seems reasonable to ask if they're the same. – Todd Walton Dec 19 '18 at 18:28
@ToddWalton Good point! Another example: `a='3x'; if [[ $a == 3* ]]; then echo yes; fi` and you can not switch both sides of `==`. – Walter A Dec 19 '18 at 22:46
@WalterA yes that's true (in Bash, at least). Are you suggesting any improvement to my answer? – Tom Fenech Dec 20 '18 at 00:36
No, your answer is fine. I really like to see that the community likes our answers just as much. We use different styles and both are regarded very helpful. I just gave you an upvote, so for this moment we have the same number of upvotes. – Walter A Dec 20 '18 at 08:03
Terrific explanation @Tom Fenech Thank you! – Roger Costello Aug 22 '22 at 22:26
Just a heads up that `NR==FNR` doesn't work as expected if your first input file is empty. Having no lines means that NR is still zero going into the second file. – Mr. Llama Jan 15 '23 at 03:42
@Mr.Llama true, I updated my answer to mention that case, thanks. – Tom Fenech Jan 16 '23 at 09:03

Walter A · Answer 2 · 2017-02-09T22:10:31.477

93

Look for keys (first word of line) in file2 that are also in file1.
Step 1: fill array a with the first words of file 1:

awk '{a[$1];}' file1

Step 2: Fill array a and ignore file 2 in the same command. To do this, compare the total number of records read so far (NR) with the record number within the current input file (FNR):

awk 'NR==FNR{a[$1]}' file1 file2

Step 3: Use next to skip any actions that might come after the } while parsing file 1:

awk 'NR==FNR{a[$1];next}' file1 file2

Step 4: Print the key from file2 when it is found in array a:

awk 'NR==FNR{a[$1];next} $1 in a{print $1}' file1 file2
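To try the finished command from Step 4, here is a minimal sketch with made-up sample data (the contents of file1 and file2 below are hypothetical, chosen only so the result is easy to check by eye):

```shell
tmp=$(mktemp -d)

# Hypothetical sample data: the first word of each line is the key.
printf 'apple 1\nbanana 2\ncherry 3\n' > "$tmp/file1"
printf 'banana x\ndate y\napple z\n'   > "$tmp/file2"

# Pass 1 (file1): store keys in a. Pass 2 (file2): print keys found in a.
common=$(awk 'NR==FNR{a[$1];next} $1 in a{print $1}' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$common"
rm -r "$tmp"
```

The keys come out in the order they appear in file2 (banana, then apple), because the printing happens while file2 is being read.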

edited Feb 09 '17 at 22:10

answered Sep 09 '15 at 19:54

Walter A


Brilliant takedown of this one-liner. Is the semicolon in Step 1 necessary? – Tomasz Gandor Aug 08 '17 at 05:53
@TomaszGandor The semicolon is not needed in step 1. I could have added it in step 3, but `;next` is a weird addition (like to add `next` and need the semicolon in step 3). You can test step 1 with `awk '{a[$1]} END { for (k in a) { print "a[k]=" k } }' file1`. – Walter A Aug 08 '17 at 10:30

score 68 · Answer 3 · answered Sep 09 '15 at 14:24

Look up NR and FNR in the awk manual and then ask yourself what is the condition under which NR==FNR in the following example:

$ cat file1
a
b
c

$ cat file2
d
e

$ awk '{print FILENAME, NR, FNR, $0}' file1 file2
file1 1 1 a
file1 2 2 b
file1 3 3 c
file2 4 1 d
file2 5 2 e

Ed Morton


Is it also possible to print the number of the file being processed? Is there a built-in variable for that? (I know we could create a variable for that and increment it every time FNR is 1.) – LEo Sep 19 '19 at 16:33
In GNU awk that variable is `ARGIND`, otherwise you can do `FNR==1{ print ++file_nr }`. – Ed Morton Sep 19 '19 at 19:14
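A small sketch of the portable counter from the comment above, using the file1 and file2 contents from this answer (the variable name file_nr follows the comment). Note that a counter driven by FNR==1 only counts files that contain at least one record, so an empty file would not get a number.

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"
printf 'd\ne\n'    > "$tmp/file2"

# Bump file_nr on the first record of each file, then print it next to
# the per-file record number and the line itself.
numbered=$(awk 'FNR==1{file_nr++} {print file_nr, FNR, $0}' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$numbered"
rm -r "$tmp"
```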

score 23 · Answer 4 · edited Jul 27 '17 at 08:12

NR and FNR are awk built-in variables.

NR - The total number of records read so far, across all input files.

FNR - The number of records read so far from the current input file; it restarts at 1 for each new file.
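One way to see the two counters side by side is to print them in an END block, where NR holds the count over all files and FNR the count for the last file only. This is a sketch with made-up file names and contents (file1 with three lines, file2 with two):

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/file1"   # 3 records
printf 'd\ne\n'    > "$tmp/file2"   # 2 records

# In END, NR is the number of records read in total and FNR is the
# number of records read from the last file processed.
totals=$(awk 'END{print NR, FNR}' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$totals"
rm -r "$tmp"
```

Here NR ends at 5 (3 + 2) while FNR ends at 2, the length of file2.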

Dhruvenkumar Shah


answered Sep 09 '15 at 14:19

sat


Don Kepler Brian Seremba · Answer 5 · 2018-01-28T07:18:42.297

Assume you have files a.txt and b.txt with the following contents:

cat a.txt
a
b
c
d
1
3
5

cat b.txt
a
1
2
6
7

Keep in mind that NR and FNR are awk built-in variables.

NR - The total number of records read so far (in this case across both a.txt and b.txt).

FNR - The number of records read so far from the current input file (a.txt or b.txt); it restarts at 1 for each file.

awk 'NR==FNR{a[$0];}{if($0 in a)print FILENAME " " NR " " FNR " " $0}' a.txt b.txt
a.txt 1 1 a
a.txt 2 2 b
a.txt 3 3 c
a.txt 4 4 d
a.txt 5 5 1
a.txt 6 6 3
a.txt 7 7 5
b.txt 8 1 a
b.txt 9 2 1

Let's add next so that the second block is skipped for the lines of the first file (those matched by NR==FNR).

Lines in b.txt that are also in a.txt:

awk 'NR==FNR{a[$0];next}{if($0 in a)print FILENAME " " NR " " FNR " " $0}' a.txt b.txt
b.txt 8 1 a
b.txt 9 2 1

Lines in b.txt but not in a.txt:

awk 'NR==FNR{a[$0];next}{if(!($0 in a))print FILENAME " " NR " " FNR " " $0}' a.txt b.txt
b.txt 10 3 2
b.txt 11 4 6
b.txt 12 5 7

The same test written as a bare pattern, which prints only the line itself:

awk 'NR==FNR{a[$0];next}!($0 in a)' a.txt b.txt
2
6
7
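The opposite comparison (lines in a.txt but not in b.txt) can be sketched by swapping the order of the files, so that b.txt is the one loaded into the array. The array name seen is my own choice; the file contents are the ones listed at the top of this answer.

```shell
tmp=$(mktemp -d)
printf 'a\nb\nc\nd\n1\n3\n5\n' > "$tmp/a.txt"
printf 'a\n1\n2\n6\n7\n'       > "$tmp/b.txt"

# b.txt is read first and remembered; a.txt lines not seen there are printed.
only_in_a=$(awk 'NR==FNR{seen[$0];next} !($0 in seen)' "$tmp/b.txt" "$tmp/a.txt")

printf '%s\n' "$only_in_a"
rm -r "$tmp"
```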

score 0 · Answer 6 · answered Mar 18 '22 at 04:26

Here is pseudocode for what the one-liner from the question does.

NR = 1
for (i=1; i<=files.length; ++i) {
    line = read line from files[i]
    FNR = 1
    while (not EOF) {
        columns = getColumns(line)

        if (NR == FNR) { // processing first file
            add columns[1] to a
        } else { // processing remaining files
            if (columns[1] exists in a) {
                print columns[1]
            }
        }
        NR = NR + 1
        FNR = FNR + 1
        line = read line from files[i]
    }
}
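For comparison, the same if/else structure can be written as runnable awk. This is only a sketch: the sample keys k1, k2, k3 are made up, and it uses an explicit else branch in place of the next statement from the original one-liner.

```shell
tmp=$(mktemp -d)
printf 'k1\nk2\n'             > "$tmp/file1"
printf 'k2 x\nk3 y\nk1 z\n'   > "$tmp/file2"

matches=$(awk '{
    if (NR == FNR) {          # still reading the first file
        a[$1]                 # remember the first field as a key
    } else if ($1 in a) {     # later files: was this key in the first file?
        print $1
    }
}' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$matches"
rm -r "$tmp"
```

awk itself handles the loop over files and lines and maintains NR and FNR, so only the body of the inner loop has to be written.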
