I am trying to turn an awk script (with a BEGIN block) into a loop. The original code edits the data based on the value of one column ("YBr" in the code below) and writes the result to a new file.
This is the original code (that works great):
awk '
BEGIN{
    FS=OFS=","
}
FNR==1{
    for(i=1;i<=NF;i++){
        if($i=="YBr"){
            field=i
        }
        if($i=="NationalCowID"){
            value=i
        }
    }
}
$field==1{
    for(i=value+1;i<=NF;i++){
        $i="*"
    }
}
1
' obvs.csv > obvs1.csv
The above code takes the file obvs.csv that looks like this:
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,20141208,6,2,92,1,2014,424.4410055,NA,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,20141207,6,2,91,1,2014,380.94688,NA,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,20141207,6,2,91,1,2014,380.94688,NA,9
and edits the data to look like this (obvs1.csv):
NationalCowID,TestDate,Batch,LN,DIM,YBr,year,CH4,PLS,qtl
206004574,*,*,*,*,*,*,*,*,1
206004573,20141209,6,2,93,2,2014,436.4504712,NA,4
206004575,*,*,*,*,*,*,*,*,6
206004576,20141208,6,2,92,2,2014,424.4410055,NA,7
206004579,20141209,6,2,93,2,2014,436.4504712,NA,8
206004571,*,*,*,*,*,*,*,*,9
I would like to turn this code into a loop so that a new file is created for each value of the "Batch" column (1-6), with the edits applied only to the lines that have that Batch value. I've read some examples and the command explanation, but I don't fully understand what each part of the code does. For example, how do I refer to the shell loop variable $j inside the awk program, as opposed to the $i that awk already uses for fields? This is the loop I have tried to create:
for j in {1..6}
do
awk '
BEGIN{
    FS=OFS=","
}
FNR==$j{
    for(i=1;i<=NF;i++){
        if($i=="Batch"){
            field=i
        }
        if($i=="NationalCowID"){
            value=i
        }
    }
}
$field==1{
    for(i=value+1;i<=NF;i++){
        $i="*"
    }
}
$j
' obvs.csv > obvs$j.csv
done
In the end I am hoping to have 6 files as follows:
obvs1.csv -> only lines with Batch = 1 are edited
obvs2.csv -> only lines with Batch = 2 are edited
obvs3.csv -> only lines with Batch = 3 are edited
obvs4.csv -> only lines with Batch = 4 are edited
obvs5.csv -> only lines with Batch = 5 are edited
obvs6.csv -> only lines with Batch = 6 are edited
So the number in the file name corresponds to the "Batch" value that selects which lines to edit. For example, in obvs2.csv, every data line where Batch equals 2 would have all columns except the first and last replaced with *. So far, I end up with 6 files that are named correctly, but the edits inside them are not correct. Any direction or explanation of the code is greatly appreciated!
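One possible direction, as a minimal sketch rather than a definitive fix: inside a single-quoted awk program the shell never expands $j, so awk reads it as "field number j" (and j is an unset awk variable). The usual way around this is to hand the shell value to awk with -v. The sketch below assumes the header lookup should stay on line 1 (FNR==1), that the last column should be left untouched as described above (hence i<NF), and that the file is plain comma-separated with no quoted commas. The function name mask_batches and the awk variable name batch are made up for illustration.

```shell
#!/bin/sh
# Sketch: write one output file per Batch value, masking only matching rows.
# mask_batches is a hypothetical helper name, not part of awk or the shell.
mask_batches() {
    input=$1    # e.g. obvs.csv
    # An explicit list is used instead of {1..6}, which is a bash extension.
    for j in 1 2 3 4 5 6; do
        # -v copies the shell variable $j into the awk variable "batch".
        # Inside the quoted program, a bare $j would mean "field number j".
        awk -v batch="$j" '
            BEGIN { FS = OFS = "," }
            # Header line only: look up the column numbers by name.
            FNR == 1 {
                for (i = 1; i <= NF; i++) {
                    if ($i == "Batch")         field = i
                    if ($i == "NationalCowID") value = i
                }
            }
            # Data lines whose Batch equals the requested value:
            # mask everything after NationalCowID except the last column.
            FNR > 1 && $field == batch {
                for (i = value + 1; i < NF; i++) $i = "*"
            }
            1   # always-true pattern: print every line, edited or not
        ' "$input" > "obvs$j.csv"
    done
}
```

Calling mask_batches obvs.csv should then write obvs1.csv through obvs6.csv. With the sample data above, every row has Batch = 6, so only obvs6.csv would differ from the input.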