AWK to search a specific sequence and if found search in the next line another sequence

Question

I'm trying to find a string in a text file and, each time it is found, look for a specific string and replace it with another one, while skipping the first sequence (the offset column) of each line.

Consider the following hexadecimal text:

0000  09 06 07 04 00 00 01 00 1d 03 4b 2c a1 2a 02 01   
0010  b7 09 01 47 30 22 a0 0a 80 08 33 04 03 92 22 14   
0020  17 f0 a1 0b 80 00 81 00 84 01 00 86 00 85 00 83   
0030  07 91 94 71 06 00 07 19

0000  09 06 07 04 00 00 01 00 2b 03 4b 27 a1 25 02 01   
0010  00 09 01 66 30 1d a0 0a 80 08 33 04 03 92 22 14   
0020  17 f0 a1 06 82 00 84 00 85 00 82 07 91 94 71 06   
0030  00 07 19

Expected output:

0000  09 06 07 04 00 00 01 00 1d 03 4b 2c a1 2a 02 01   
0010  b7 09 01 47 30 22 a0 0a 80 08 33 04 03 92 22 14   
0020  12 f0 a1 0b 80 00 81 00 84 01 00 86 00 85 00 83   
0030  07 91 94 71 06 00 07 19

0000  09 06 07 04 00 00 01 00 2b 03 4b 27 a1 25 02 01   
0010  00 09 01 66 30 1d a0 0a 80 08 33 04 03 92 22 14   
0020  12 f0 a1 06 82 00 84 00 85 00 82 07 91 94 71 06   
0030  00 07 19

Each time I encounter a 4b sequence, I need to look for a 14 sequence and, if it is found, check the first value on the next line (17 in this case); if that value is 17, change it to 12. The sequence on the left only gives the offset of the line within the text, so it should not be analysed: it repeats in every paragraph.

This is what I have so far:

gawk  ' { for ( i = 1; i <= NF; ++i ) {

    if ( $i == "4b" )
        r = 1
    if ( r && ($i == "14" ))
        t = 1
    if ( r && t && $i == "17") {
        r = 0
        t = 0
        $i = "12"

    }
  }
}
1 ' example.txt example2.txt

However, I don't know how to skip the first xxxx sequence of each line.

Thanks for sharing your efforts, could you please post samples of expected output too in your question for better understanding of your question, thank you. — RavinderSingh13, Feb 17 '21 at 13:12
Thanks much better now, `I encounter a 4b sequence to look for 14 sequence` so is it after `4b` you want to look for 14th item and check its value is 17 or not? Could you please confirm on this one, thank you. — RavinderSingh13, Feb 17 '21 at 13:18
No, what I want is to encounter a 4b sequence. Then look for 14 sequence and then look for 17 sequence in the next line just after the 0000 sequence or 0010 etc. I mean the sequence on the left doesn't matter — Max, Feb 17 '21 at 13:22

Ed Morton · Accepted Answer · 2021-02-17T17:22:33.013

As in life, when processing data it's much easier to make decisions based on what has happened in the past (data you have read) rather than what will happen in the future (data you are going to read) so instead of saying "if I have X and the thing after it is Y" write your requirements as "if I have Y and the thing before it was X" and the software to implement it usually becomes much more simple and obvious.

Is this what you're trying to do (using any awk in any shell on every Unix box):

$ cat tst.awk
($2 == 17) && (p1 ~ / 14 /) && (p2 ~ / 4b /) {
    sub(/ 17 /," 12 ")
}
{ p2=p1; p1=$0" "; print }

$ awk -f tst.awk file
0000  09 06 07 04 00 00 01 00 1d 03 4b 2c a1 2a 02 01
0010  b7 09 01 47 30 22 a0 0a 80 08 33 04 03 92 22 14
0020  12 f0 a1 0b 80 00 81 00 84 01 00 86 00 85 00 83
0030  07 91 94 71 06 00 07 19

0000  09 06 07 04 00 00 01 00 2b 03 4b 27 a1 25 02 01
0010  00 09 01 66 30 1d a0 0a 80 08 33 04 03 92 22 14
0020  12 f0 a1 06 82 00 84 00 85 00 82 07 91 94 71 06
0030  00 07 19

If that's not all you need then edit your question to clarify your requirements and provide more truly comprehensive sample input/output including cases that the above doesn't work for.

I'm using sub(/ 17 /," 12 ") above instead of $2=12 to preserve white space between fields. It's safe to do that because the target field is $2; if it were any other field you couldn't do that, as a field before the target one might also be 17. There are various sub()/match()/substr() ways to handle that of course.
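
One possible match()/substr() approach for a target field other than $2 is sketched below. It is only an illustration under stated assumptions: the function name replace_field, the shell wrapper replace_nth and the variables n, old and repl are invented here and do not come from the answer, and fields are assumed to be separated by spaces or tabs. The idea is to walk to field n with match() and splice the replacement in with substr(), so the original spacing is untouched and an earlier field holding the same value is left alone.

```shell
# Hypothetical helper: replace field number n with repl only if it equals old,
# leaving every other character of the line (including spacing) as it was.
replace_nth() {
  awk -v n="$1" -v old="$2" -v repl="$3" '
  function replace_field(line, n, old, repl,    head, i) {
      head = ""
      for (i = 1; i <= n; i++) {
          if (!match(line, /[^ \t]+/))   # fewer than n fields: return line unchanged
              return head line
          if (i == n)
              break                      # RSTART/RLENGTH now delimit field n
          head = head substr(line, 1, RSTART + RLENGTH - 1)
          line = substr(line, RSTART + RLENGTH)
      }
      if (substr(line, RSTART, RLENGTH) == old)
          line = substr(line, 1, RSTART - 1) repl substr(line, RSTART + RLENGTH)
      return head line
  }
  { print replace_field($0, n, old, repl) }
  '
}

# Field 2 is also 17, but only field 4 is changed.
printf "0020  17 f0 17 0b\n" | replace_nth 4 17 12
```

Because the line is never rebuilt from its fields, the double space after the offset survives, which assigning to $4 would not guarantee.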

RavinderSingh13 · Answer 2 · 2021-02-17T14:55:10.517

Based on your shown samples, could you please try following, written and tested with GNU awk.

awk '
!NF{ found1=found2=0 }
/(^|[[:space:]])4b([[:space:]]|$)/{
  found1=1
  print
  next
}
found1 && /(^|[[:space:]])14([[:space:]]|$)/{
  found2=1
  print
  next
}
found1 && found2{
  for(i=2;i<=NF;i++){
    if($i==17){ $i=12 }
  }
  print
  next
}
1
'  Input_file

Explanation: Adding detailed explanation for above.

awk '                                                 ##Starting awk program from here.
!NF{ found1=found2=0 }                                ##Resetting found1 and found2 on an empty line, i.e. at the start of a new paragraph.
/(^|[[:space:]])4b([[:space:]]|$)/{                   ##Checking condition if line has 4b with spaces or coming in starting or ending of line.
  found1=1                                            ##Then set found1 to 1 here.
  print                                               ##Printing the current line here.
  next                                                ##next will skip all further statements from here.
}
found1 && /(^|[[:space:]])14([[:space:]]|$)/{         ##Checking if found1 is SET AND if line has 14 with spaces or coming in starting or ending of line.
  found2=1                                            ##Setting found2 to 1 here.
  print                                               ##Printing the current line here.
  next                                                ##next will skip all further statements from here.
}
found1 && found2{                                     ##Checking condition if found1 and found2 is SET then do following.
  for(i=2;i<=NF;i++){                                 ##Traversing through all fields here starting from 2nd field.
    if($i==17){ $i=12 }                               ##Checking condition if field value is 17 then make it 12.
  }
  print                                               ##Printing current line.
  next                                                ##next will skip all further statements from here.
}
1                                                     ##1 will print current line.
'  Input_file                                         ##Mentioning Input_file name here.

Done, take a look when possible – Luka Feb 17 '21 at 17:44

anubhava · Answer 3 · 2021-02-17T17:00:24.033

Your attempted awk command is pretty good, you just need to make sure to use -v RS= (empty RS) to make each paragraph a record.

Following should work for you in gnu-awk:

cat fmt.awk

{
   ORS = RT  # set ORS same RT variable populated using RS
}
{
   r = t = p = ""
   for ( i = 1; i <= NF; ++i ) {
      # set r = 1 when we get 4b
      if ( $i == "4b" )
         r = 1
      # set t = 1 when we get 14 when r==1
      if ( r && $i == "14" )
         t = 1
      # when we get 4 digits save the position
      if ($i ~ /^[0-9]{4}$/)
         p = i+1
      # replace 17 with 12 when we get 17 when t==1
      if ( t && p == i && $i == "17" ) {
         $0 = gensub("((\\S+\\s+){"i-1"})\\S+", "\\112", 1)
         break
      }
   }
} 1

Run it as:

awk -v RS= -f fmt.awk file

0000  09 06 07 04 00 00 01 00 1d 03 4b 2c a1 2a 02 01
0010  b7 09 01 47 30 22 a0 0a 80 08 33 04 03 92 22 14
0020  12 f0 a1 0b 80 00 81 00 84 01 00 86 00 85 00 83
0030  07 91 94 71 06 00 07 19

0000  09 06 07 04 00 00 01 00 2b 03 4b 27 a1 25 02 01
0010  00 09 01 66 30 1d a0 0a 80 08 33 04 03 92 22 14
0020  12 f0 a1 06 82 00 84 00 85 00 82 07 91 94 71 06
0030  00 07 19
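
To see what the empty RS does on its own, a small sketch may help. The helper name show_paragraph_records and the shortened sample input are made up for illustration. With RS set to the empty string, awk treats each blank-line-separated paragraph as one record and newlines act as field separators, so the line offsets (0010, 0020, ...) become ordinary fields in the middle of the record. That is presumably why the script above has to remember the position that follows a four-digit field.

```shell
# Hypothetical helper: show how awk splits the input when RS is empty.
show_paragraph_records() {
  awk -v RS= '{ printf "record %d: %d fields, first=%s, last=%s\n", NR, NF, $1, $NF }' "$@"
}

# Two paragraphs separated by a blank line.
printf "0000  09 4b\n0010  14\n\n0000  2b\n" | show_paragraph_records
```

For this input the first record has five fields (0000, 09, 4b, 0010, 14) and the second has two.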

edited Feb 17 '21 at 17:00

answered Feb 17 '21 at 14:54

Could you explain the two last if conditionals that you made? – Max Feb 17 '21 at 15:47
@Max: I have added explanation in my answer. Please let me know if any part is not clear – anubhava Feb 17 '21 at 16:06
The OP said `... look for 14 sequence and if found look in the next line ... for 17`. Your script doesn't quite do that: it looks in EVERY line after the 14 is found until it finds a 17, so if 17 wasn't the 2nd field of the line after 14 is found (line 3 in the examples) but was the 2nd field of a line after that (e.g. line 4 in the examples) then the script would change THAT 17 to a 12. `t` would need to hold a line number or something - it's not clear from the question if that's needed for `r` or not too. – Ed Morton Feb 17 '21 at 17:08
Hmm my interpretation was to replace `17` in any of the next lines of `14`. Maybe that's not what OP meant – anubhava Feb 17 '21 at 17:12
Yeah, I do wish the OP had provided more useful sample input/output instead of just that one sunny day case - as seen with our recent discussion and discussions I had previously with @RavinderSingh13 there are a lot of edge cases to consider. – Ed Morton Feb 17 '21 at 17:13
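
The "hold a line number" idea from the comments could be sketched roughly as follows. This is one interpretation of the requirement, not code from any of the answers: the wrapper fix_next_line and the variables seen4b and line14 are invented for illustration, lines are assumed to start with the offset (no leading white space), and the state is reset at each blank line and after each replacement.

```shell
# Hypothetical sketch: change a leading 17 to 12 only on the line that
# immediately follows the line where 14 was seen (after an earlier 4b).
fix_next_line() {
  awk '
  !NF { seen4b = line14 = 0; print; next }     # blank line: new paragraph, reset
  line14 && NR == line14 + 1 && $2 == "17" && match($0, /[ \t]17([ \t]|$)/) {
      # splice in 12 without disturbing the spacing around it
      $0 = substr($0, 1, RSTART) "12" substr($0, RSTART + 3)
      seen4b = line14 = 0
  }
  {
      for (i = 2; i <= NF; i++) {              # start at 2 to skip the offset
          if ($i == "4b") seen4b = 1
          else if (seen4b && $i == "14") line14 = NR
      }
      print
  }
  ' "$@"
}

printf "0000  4b 01\n0010  02 14\n0020  17 f0\n0030  17 aa\n" | fix_next_line
```

With this input only the 17 on the 0020 line becomes 12; the 17 on the 0030 line is left alone because it is not on the line directly after the 14.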
