This is a follow-up question to a previous post. I'm processing a file in awk. I want to pass through the rows in the file that have blanks in column positions 45 through 50, and I want to do work on the rows that have blanks in column positions 60 through 73.
Specifically, I want to replace the blanks in column positions 60 through 73 with 0s.
That way the output file will have the original rows with blanks in 45-50 untouched, and the rows with blanks in 60-73 will have had those blanks replaced with '0's. So the output file will be the same as the input file, only with zeros in positions 60-73 of the relevant rows.
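Conceptually, the logic I'm after looks something like this (just a sketch of the intent, not my actual attempt; note that 45-50 inclusive is 6 characters and 60-73 inclusive is 14):
awk '{
    if (substr($0, 45, 6) ~ /^[[:blank:]]*$/) {
        # blanks in columns 45-50: metadata row, pass it through untouched
        print
    } else if (substr($0, 60, 14) ~ /^[[:blank:]]*$/) {
        # data row with blanks in columns 60-73: overwrite them with zeros
        print substr($0, 1, 59) "00000000000000" substr($0, 74)
    } else {
        # data row that already has values in 60-73
        print
    }
}' file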
Because I work at a bank, I had to construct an example that didn't expose any bank data, and my previous post turned out not to be an accurate description of the issue.
I've mocked up the data more accurately here, replacing the real values with all 0s. As you can see, there are rows that look like they contain mostly metadata and are different from the actual data rows; those are what I'm trying to filter out. The first row starting with 0000s is the example of the row I'm trying to fix: it has blanks at positions 60-73, and that's the stretch I want to change to all '0's. The second row is an example of a regular row with no errors in it. The third and fourth rows are the metadata rows I want to skip. In this case I've chosen column positions 45-50 as the blank columns that tell me to skip a row, because in both of the first two data rows those columns are guaranteed to have data in them. Hopefully that clears it up. The data example is shown below (the first 2 rows are not present in the real data; they're just added here so you can easily see the character positions):
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
0000 000000000 000000000000000 00000000
0000 00000000000 0000000 000000000000000000000000000000000000
0000 000000
0000 0000000000000000000000000000000000 0000000 T0000000000
All the answers I was given worked on the old example file, except that on the data above they converted the blanks to '0's in the last two metadata rows as well. The answers were:
awk '{print gensub(/^(.{9}) {10}([^ ]{24})/, "\\10000000000\\2", "g")}' file
awk '{if (match($0,/[ ]{10}/) && RSTART == 10 && substr($0,20) ~ /^#*$/) sub(/[ ]{10}/,"0000000000")}1' file
sed -E 's/^(.{9}) {10}(.{5}[^ ]{10})/\10000000000\2/' file
awk '
substr($0,10,10) ~ /^ +$/ && substr($0,20) !~ / / {
$0=substr($0,1,9) "0000000000" substr($0,20)
}
1
' Input_file
All of these scripts worked great on the old example file but also processed the metadata rows in the example above. I was naive in creating the first example data set, in that it did not accurately depict my issue: I was trying to make the data easy for readers to see, and I wasn't aware the simplification had changed the question. For those unfamiliar with the previous post, I'm including it below:
I hope I've been thorough enough. Please let me know if you have any questions.
I've been working on this over the weekend and found that the solutions using match() did not work as well as the solutions using substr(). This is my attempt; the "longtst" file is the file printed above:
cat longtst|awk '{if (substr($0,45,5) !~ /^[[:blank:]]*$/)
{if (substr($0,60,13) ~ /^[[:blank:]]*$/)
$0=substr($0,1,59) gsub(/ /,0,substr($0,60,13)) substr($0,74) }
else print $0 }'
I'm getting this error:
"-ksh: .: syntax error: `else' unexpected"
The expected output would be the following:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
0000 000000000 000000000000000000000000000000000000
0000 00000000000 0000000 000000000000000000000000000000000000
0000 000000
0000 0000000000000000000000000000000000 0000000 T0000000000
As you can see, the 13 blanks in the first row get filled in with 0s at column positions 60 through 73. As per Ed's note, the rows that have blanks at column positions 45 through 50 just get passed through. Thanks!