I came across an error in a dataset today and would like other people's opinions on a solution beyond the one I have. The last column/field of the first and second rows/records should hold the same value, and the second-to-last column/field of row/record 1 should always be "1". The problem is what to do when this is not so, and which steps are needed to correct it.
The incorrect data looks like this, in a file called "sample.txt":
5@Comedia @5@3@2@3@1/2 @3@1.6 @1@2 1/2@11@14 1/4
3@Melanistic@3@4@2@4@1 1/2@4@2 3/4@3@5 @2 @4 3/4
2@Pure @4@5@5@5@3 1/2@5@4 3/4@5@8 @3 @6 1/2
4@Profit @2@2@1@2@1.6 @1@1.6 @2@2 1/2@4 @6 1/2
1@Whammy @1@1@1@1@1.6 @2@1.6 @4@5 1/2@5 @8 1/4
The correct data should look like this:
5@Comedia @5@3@2@3@1/2 @3@1.6 @1@2 1/2 @1@4 3/4
3@Melanistic@3@4@2@4@1 1/2@4@2 3/4@3@5 @2@4 3/4
2@Pure @4@5@5@5@3 1/2@5@4 3/4@5@8 @3@6 1/2
4@Profit @2@2@1@2@1.6 @1@1.6 @2@2 1/2 @4@6 1/2
1@Whammy @1@1@1@1@1.6 @2@1.6 @4@5 1/2 @5@8 1/4
My current solution is a multi-step process that I have a feeling can be streamlined. Any suggestions are highly appreciated.
1) Create a bash variable holding the last field of row 2:
length=$(awk -F@ 'NR==2{print $NF}' sample.txt)
2) Create a file with the correct information in row 1:
awk -F@ -v l="$length" 'NR==1{$(NF-1)=1;$NF=l;print $0}' OFS=@ sample.txt >sample1.txt
3) Append the remaining rows to the newly created file:
awk -F@ 'NR>1{print $0}' sample.txt >>sample1.txt
Is there an awk, sed, or Perl one-liner (or combination of pipes) that can accomplish the three steps above in one go?
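For reference, here is one way the three steps might collapse into a single awk call. This is only a sketch: it assumes the input is a regular file that can be read twice (not a pipe), and fix_first_row is a hypothetical helper name used for illustration. The first pass remembers the last field of row 2; the second pass patches row 1 and prints every row.

```shell
# Sketch: fix row 1 of an @-delimited file in a single awk invocation.
# fix_first_row is a hypothetical helper name, not part of the steps above.
# Assumes "$1" is a regular file, since awk reads it twice.
fix_first_row() {
    awk -F@ -v OFS=@ '
        # Pass 1 (NR == FNR only while reading the first copy):
        # remember the last field of row 2, print nothing.
        NR == FNR { if (FNR == 2) last = $NF; next }
        # Pass 2: set the second-to-last field of row 1 to 1
        # and its last field to the remembered value.
        FNR == 1  { $(NF-1) = 1; $NF = last }
        # Pass 2: print every row, patched or not.
        { print }
    ' "$1" "$1"
}
```

It would be called as: fix_first_row sample.txt >sample1.txt
Rows 2 onward pass through untouched, because awk only rebuilds a record with OFS when one of its fields is assigned.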