Using an if block in awk

Question

I'm processing a file in awk.

I want to pass along the rows in the file that have blanks in column positions 25 through 34, and I want to do work on the rows that have blanks in column positions 10 through 19. Specifically, I want to replace the blanks in column positions 10 through 19 with 0s. That way the output file will have the original rows with blanks in 25-34 untouched, and the rows with blanks in 10-19 will have them replaced with '0's. So the output file will be the same as the input file, only with zeros in the relevant rows in positions 10-19. The file looks like this:

###########################################
#########          ########################
###########################################
###########################################
###########################################
###########################################
###########################################
###########################################
#########          #####          #########
###########################################
###########################################
###########################################

I know I have to use an if block but I've never used one before in awk. The syntax below is what I think I need but please help me with the details. Specifically what I'm using to specify 'blanks' in the if statements.

I apologize ahead of time for the bad syntax. This is my first time using an If block in awk. I know the syntax doesn't work, which is one of the reasons I'm posting this.

cat scr2 | awk 'BEGIN {
    pos1=substr($0,25,10); 
    pos2=substr($0,10,10);

      if (pos1 = ^[[:blank:]]$) 
         printf $0 
      else if (pos2 == ^[[:blank:]]$)
         {val=substr($0,25,10)} 
         gsub(/ /,0,val){$0=substr($0,1,24) val substr($0,35)} 1}'`

The sample output would be :

###########################################
#########0000000000########################
###########################################
###########################################
###########################################
###########################################
###########################################
###########################################
#########          #####          #########
###########################################
###########################################
###########################################

So the row with blanks only at positions 10-19 gets changed and the row with blanks at both 10-19 and 25-34 gets left alone.

How do you expect anyone to read that script? For pity's sake — use newlines; use newlines liberally. You seem to be missing `/…/` delimiters around some regexes, and to be using `==` where you need `~` (the regex match operator). Or you're missing double quotes around strings. There is nothing in `$0` inside the `BEGIN` block of an Awk script — nothing has been read when the `BEGIN` block is executed. — Jonathan Leffler, Jul 15 '21 at 16:55
As an aside, get rid of the [useless `cat`.](https://stackoverflow.com/questions/11710552/useless-use-of-cat) — tripleee, Jul 15 '21 at 18:57
Follow-up question: https://stackoverflow.com/questions/68412741/using-awk-to-change-a-block-of-blanks-to-0s-while-screen-out-rows-with-blanks — tripleee, Jul 16 '21 at 19:54
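
For reference, the points raised in the comments above (work in the main block rather than `BEGIN`, `~` with `/…/` delimiters for regex matches, no `cat`) can be put together in a minimal sketch. It assumes the intent described in the question: rows with blanks in columns 25-34 pass through unchanged, and rows with blanks only in columns 10-19 get zeros. The wrapper name `zero_fill` and the `printf` sample input are illustrative and not part of the original post.

```shell
# zero_fill is a hypothetical wrapper name; it reads the given files or stdin.
zero_fill() {
  awk '{
    pos1 = substr($0, 25, 10)   # columns 25-34
    pos2 = substr($0, 10, 10)   # columns 10-19

    if (pos1 ~ /^ +$/) {
      print                     # blanks in 25-34: pass the row through
    } else if (pos2 ~ /^ +$/) {
      # blanks in 10-19 only: rebuild the row with zeros there
      print substr($0, 1, 9) "0000000000" substr($0, 20)
    } else {
      print                     # everything else is unchanged
    }
  }' "$@"
}

# Three sample rows taken from the question.
printf '%s\n' \
  '###########################################' \
  '#########          ########################' \
  '#########          #####          #########' |
zero_fill
```

The three branches are spelled out for readability; the first and last could be merged, since both just print the row.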

score 3 · Answer 1 · edited Jul 15 '21 at 19:05

With your shown samples, please try the following awk code. It was written and tested in GNU awk and should work in any awk.

awk '
substr($0,10,10) ~ /^ +$/ && substr($0,20) !~ / / {
  $0=substr($0,1,9) "0000000000" substr($0,20)
}
1
' Input_file

Explanation: the main awk program checks two conditions. First, positions 10 through 19 must contain only spaces; second, the rest of the line (from position 20 on) must contain no spaces. If both hold, the spaces are replaced with zeros. The trailing 1 then prints every line, edited or not.
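
If the column range might change, the same two conditions can be written with the start column and width as variables. This is only a sketch of that idea: the names `fill_blanks`, `start`, `len` and `zeros` are made up for illustration and do not appear in the answer above.

```shell
# fill_blanks is a hypothetical wrapper; start and len are assumed parameters.
fill_blanks() {
  awk -v start=10 -v len=10 '
  BEGIN {
    # Build a run of len zeros: pad to len spaces, then swap each for 0.
    zeros = sprintf("%" len "s", "")
    gsub(/ /, "0", zeros)
  }
  # Condition 1: the target columns hold only spaces.
  # Condition 2: nothing after the target columns contains a space.
  substr($0, start, len) ~ /^ +$/ && substr($0, start + len) !~ / / {
    $0 = substr($0, 1, start - 1) zeros substr($0, start + len)
  }
  1
  ' "$@"
}

# Two sample rows from the question: only the first should change.
printf '%s\n' \
  '#########          ########################' \
  '#########          #####          #########' |
fill_blanks
```

With `start=10` and `len=10` this should behave like the fixed-column version; other values are untested assumptions about how the layout might vary.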

edited Jul 15 '21 at 19:05 by Jonathan Leffler · answered Jul 15 '21 at 18:54 by RavinderSingh13

@DavidC.Rankin, Thank you sir. – RavinderSingh13 Jul 15 '21 at 19:40
Always an interesting read, awk is getting less of a riddle to me now :-) – The fourth bird Jul 15 '21 at 19:42
@Thefourthbird, you're welcome. You are a champ in regex; I have the same feeling for your answers too, cheers. – RavinderSingh13 Jul 15 '21 at 19:43
Hi Guys, The answers are all great. I see some of the answers are 'seeing' both blocks of space and only taking action on the first. I think I should have been more specific. The data I'm working with is Bank data so I couldn't put actual numbers down. I thought excluding blanks anywhere would be fine. In my case, on the lines we don't want, most of the columns are blank so I tried to just use '#' for the 'generic' data. In reality the rows we want to screen out are mostly blank. They contain marco info. I'll add the new example rows in the main section. – Carbon Jul 15 '21 at 20:50
@Carbon, Request you to please revert your question's latest update to previous one as many users had replied as per your previous question and it will be waste of their efforts as well as it will confuse users. You could open a fresh question for same and it could be discussed there. – RavinderSingh13 Jul 16 '21 at 15:28
OK, I added example rows. Also, hi Ravinder, my data in the first example was a bit too naive. If you look at the example rows I just posted, I'm sure the fix would only be a small tweak to the code, although I do understand your point that users will log on and look up this question and, although the initial question would be answered, the secondary update would give answers that would be different. However, both sets of potential answers would work, and the update I just posted could be taken as sort of a Part II to the question. I just caught this bug, which caused this update. – Carbon Jul 16 '21 at 15:35
Hi Again Ravinder, Do you think this change will warrant a new post? If you think so please let me know. I would like my questions to contribute and don't want to confuse or mislead anyone. – Carbon Jul 16 '21 at 15:45
@Carbon, yes please I think you should revert this change and open a new question(make sure you add your tried code like you added in this question) and we could discuss it there, changing or updating question is not encouraged IMHO, cheers. – RavinderSingh13 Jul 16 '21 at 15:47
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/234978/discussion-between-carbon-and-ravindersingh13). – Carbon Jul 16 '21 at 16:21
@Carbon, sorry I can't join chat room you could comment here. – RavinderSingh13 Jul 16 '21 at 16:23

David C. Rankin · Answer 2 · 2021-07-15T19:17:33.247

Another option is using match(), which sets the RSTART built-in variable to the start of the block of spaces. You can then use substr() in a regex comparison to verify that the remainder of the line consists only of '#' characters. For example:

awk '{if (match($0,/[ ]{10}/) && RSTART == 10 && substr($0,20) ~ /^#*$/) sub(/[ ]{10}/,"0000000000")}1' file

The above uses match() to find lines whose first run of 10 spaces begins at column 10 and, when the rest of the line is all '#' characters, replaces those spaces with 10 '0's.
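
To see what match() reports, here is a small sketch that prints RSTART and RLENGTH for three of the sample rows. The wrapper name `show_match` is made up for illustration, and the sketch uses `/ +/` instead of `/[ ]{10}/` so it does not depend on interval-expression support.

```shell
# show_match is a hypothetical helper that reports where the first run of
# spaces starts (RSTART) and how long it is (RLENGTH) on each line.
show_match() {
  awk '{
    if (match($0, / +/)) {
      print NR ": RSTART=" RSTART " RLENGTH=" RLENGTH
    } else {
      print NR ": no spaces"
    }
  }' "$@"
}

printf '%s\n' \
  '###########################################' \
  '#########          ########################' \
  '#########          #####          #########' |
show_match
# 1: no spaces
# 2: RSTART=10 RLENGTH=10
# 3: RSTART=10 RLENGTH=10
```

Rows 2 and 3 report the same values, because match() only describes the leftmost match. That is why the one-liner also needs the `substr($0,20) ~ /^#*$/` test to leave the row with two blank blocks alone.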

Example Use/Output

With your input in the file named lines, you would have:

$ awk '{if (match($0,/[ ]{10}/) && RSTART == 10 && substr($0,20) ~ /^#*$/) sub(/[ ]{10}/,"0000000000")}1' lines
###########################################
#########0000000000########################
###########################################
###########################################
###########################################
###########################################
###########################################
###########################################
#########          #####          #########
###########################################
###########################################
###########################################

score 3 · Answer 3 · answered Jul 15 '21 at 19:09

I'd use sed here:

sed -E 's/^(.{9}) {10}(.{5}[^ ]{10})/\10000000000\2/' file
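
The pieces of that expression can be read as follows. This is the same command wrapped in a function with comments; the name `zero_fill_sed` and the sample input are illustrative only.

```shell
# zero_fill_sed is a hypothetical wrapper around the command above.
zero_fill_sed() {
  # ^(.{9})          group 1: columns 1-9, any characters
  #  {10}            exactly 10 spaces in columns 10-19
  # (.{5}[^ ]{10})   group 2: columns 20-24, then 10 non-spaces in 25-34
  # \1 and \2 put both groups back around the ten zeros.
  sed -E 's/^(.{9}) {10}(.{5}[^ ]{10})/\10000000000\2/' "$@"
}

printf '%s\n' \
  '#########          ########################' \
  '#########          #####          #########' |
zero_fill_sed
```

Note that this checks only columns 25-34 for blanks, not the rest of the line, which matches the two cases shown in the question.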

answered Jul 15 '21 at 19:09 by glenn jackman

Very nicely done. (and much shorter than mine) I'll give it a nod. – David C. Rankin Jul 15 '21 at 19:19

score 2 · Pierre François · Answer 4 · 2021-07-15T19:37:19.660

You can do it in awk without any if statement:

awk '{print gensub(/^(.{9}) {10}([^ ]{24})/, "\\10000000000\\2", "g")}' file

This replaces the 10 blanks in positions 10 to 19 with 10 zeros, but only on lines that have no blanks in positions 20 to 43, which is what you want, I guess.

edited Jul 15 '21 at 19:37 · answered Jul 15 '21 at 19:25 by Pierre François

Only note would be that `gensub()` is gawk and may not be available in other awks. – David C. Rankin Jul 15 '21 at 19:47
