awk numbered columns and ignore errors

Question

The following works well: it sums the 2nd-column values of every S_nn row.

awk -F "," '/s_/ {cons = cons + $2} END {print cons}' G.csv

How can I change this to add only when nn is between N1 and N2, e.g. s_23 and s_24?

Also, is it possible to count a line as 1 if it has junk instead of a number in the 2nd column?

S_22, 1
S_23, 0
S_24, 1
S_25, 1
S_26, ?

Sample input: sum s_24 to s_26

Sample output: 1+1+1=3 (the last 1 stands in for the junk value)

Please show us sample Input in CODE TAGS and expected output in CODE TAGS. — RavinderSingh13, Feb 13 '18 at 02:17
Also please do add what are the conditions or adding 0 or 1 to 5th column? What do you mean by junk characters? and what is `nnn` between N1 and N2 too here? — RavinderSingh13, Feb 13 '18 at 02:22
Please add sample input and your desired output for that sample input to your question. — Cyrus, Feb 13 '18 at 05:48

kvantour · Accepted Answer · 2018-02-13T17:47:11.797

The solution is rather simple: all you need is a numeric test.

awk -v start=24 -v stop=26 '
     BEGIN { FS="[_,]" }
     (start <= $2 ) && ($2 <= stop) { s = s + (($3==$3+0)?$3:1) }
     END{ print s+0 }' <file>

which outputs 3 for the sample input in the question.

How does it work:

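line 1: passes the start and stop values to awk as variables (-v start=24 -v stop=26)
the BEGIN block redefines the field separator as either _ or , so each line now has 3 fields: S, the number, and the value
the second line checks whether field 2 (the number) lies between start and stop; if so, the value is added to the sum
field 3 is tested for being a number with the condition $3==$3+0; if the test fails, the value is taken to be 1
the END block prints s+0 so that 0 is printed, rather than an empty line, when no row matches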
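To see the field split and the numeric test in isolation, here is a small sketch. The function name show_fields is made up for this illustration and is not part of the answer; it prints the three fields of each line and whether field 3 passes the $3==$3+0 test.

```shell
# Hypothetical helper (not from the answer): show how each line is split
# by FS="[_,]" and whether field 3 passes the numeric test.
show_fields() {
  awk 'BEGIN { FS = "[_,]" }
       { printf "%s|%s|%s -> %s\n", $1, $2, $3, (($3 == $3+0) ? "number" : "junk") }'
}

# Two rows taken from the sample data in the question.
printf 'S_24, 1\nS_26, ?\n' | show_fields
# S|24| 1 -> number
# S|26| ? -> junk
```

Note that field 3 keeps its leading space (" 1"). awk still treats it as a number because a field that looks numeric, with or without surrounding blanks, is compared numerically.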

If you want to see the numbers printed, you can use:

awk -v start=24 -v stop=26 '
     BEGIN{ FS="[_,]" }
     (start <= $2 ) && ($2 <= stop) {
        v = ($3==$3+0)?$3:1
        s = s + v
        printf "%s%d", (c++?"+":""), v
     }
     END{ printf "=%d\n", s }' <file>

Output:

1+1+1=3

The printf statement prints "+" followed by the value v every time except the first, when it prints v alone. This is controlled by the counter c, which awk initialises to zero. The expression (c++?"+":"") decides whether this is the first entry or not: c++ returns the current value of c and only afterwards sets c to c+1, which is why it is called the post-increment operator. Thus, the first time, c is 0, so (c++?"+":"") returns "" and sets c to 1. The second time, c is 1, so it returns "+" and sets c to 2.
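The separator trick can also be tried on its own. The sketch below is only an illustration; the function name join_plus is made up, and the input is simply one value per line.

```shell
# Hypothetical helper (not from the answer): join input lines with "+"
# using the same (c++ ? "+" : "") idiom.
join_plus() {
  awk '{ printf "%s%s", (c++ ? "+" : ""), $0 }   # no "+" before the first line
       END { printf "\n" }'
}

printf '1\n1\n1\n' | join_plus
# 1+1+1
```

Because c is still 0 when the first line is read, the ternary yields the empty string there and "+" for every later line, so no leading or trailing separator appears.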

the second block worked well, the first one needs some correction: returned a 1 — Tims, Feb 13 '18 at 17:27
oops...I was asking about (c++?"+":"") in your line of awk code — Tims, Feb 13 '18 at 17:38
