
In Bash, I want to get the Nth word of a string after a matching pattern with awk.

Example text:

hadf asdfi daf PATTERN asdf dsjk PRINT_THIS asdf adas
asdf sdf PATTERN asdf dasdf PRINT_THIS ads asdf PATTERN ads da PRINT_THIS
ads PATTERN ads da PRINT_THIS

Expected output:

PRINT_THIS
PRINT_THIS
PRINT_THIS
PRINT_THIS

So whenever PATTERN is found, the third word after the match (that is, skipping the two words in between) should be printed.

How can I do this?

BeatONE

2 Answers

With GNU grep:

grep -oP '.*?\bPATTERN(?:\h+\H+){2}\h+\K\S+' file

Perl:

perl -lnE 'while (/.*?\bPATTERN(?:\h+\H+){2}\h+(\S+)/g) { say $1; }' file

Demo and explanation of regex

Or with awk:

awk '/PATTERN[[:blank:]]/{for(i=1;i<=NF-3;i++) if ($i ~ /^PATTERN$/) print $(i+3)}' file

All print:

PRINT_THIS
PRINT_THIS
PRINT_THIS
PRINT_THIS
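
Since the question asks for the Nth word in general, the awk one-liner can probably be parameterized. A minimal sketch, assuming PATTERN is a literal whole word rather than a regex; the function name nth_after and the awk variables pat and n are made up for illustration:

```shell
# Hypothetical helper: print the n-th field after every field equal to pat.
# $1 = literal word to look for, $2 = offset n, remaining args = input files (stdin if none)
nth_after() {
  local pat=$1 n=$2
  shift 2
  awk -v pat="$pat" -v n="$n" '
    {
      # Stop at NF - n so that $(i + n) always refers to an existing field.
      for (i = 1; i + n <= NF; i++)
        if ($i == pat)
          print $(i + n)
    }' "$@"
}

# Sample run on two lines from the question.
printf '%s\n' \
  'hadf asdfi daf PATTERN asdf dsjk PRINT_THIS asdf adas' \
  'ads PATTERN ads da PRINT_THIS' |
  nth_after PATTERN 3
```

With n set to 3 this should behave like the awk command above on the sample input; the loop bound i + n <= NF plays the same role as NF-3 there.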
dawg

So, should it be in Bash or with awk or grep? In Bash you can do the following:

# Split each line of the input into words, then scan the words for PATTERN.
while read -ra tokens; do
  for idx in "${!tokens[@]}"; do
    [[ "${tokens[idx]}" = 'PATTERN' ]] && printf '%s\n' "${tokens[idx + 3]}"
  done
done < file

In case the tokens between PATTERN and PRINT_THIS cannot contain another PATTERN, you could make it a bit more wannabe-efficient (and uglier), like this:

while read -ra tokens; do
  for ((idx = 0; idx < ${#tokens[@]}; ++idx)); do
    # idx += 3 moves the index to the printed word, so the skipped words are never tested.
    [[ "${tokens[idx]}" = 'PATTERN' ]] && printf '%s\n' "${tokens[idx += 3]}"
  done
done < file

Notice the += instead of +, as in “making loops hard to read 101”.

Last but not least, declaring the index as an integer with declare -i idx might make it (even) a tiny bit more efficient.
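
One thing to watch in both loops: if PATTERN sits within the last three words of a line, the array element at idx + 3 is unset and an empty line gets printed. A sketch of a guarded variant with the offset as a parameter; the function name print_nth_after and its arguments are hypothetical, and it needs Bash for the arrays:

```shell
# Hypothetical helper: read lines from stdin and print the word n positions
# after each word equal to pat, skipping matches too close to the end of a line.
print_nth_after() {
  local pat=$1 n=$2 idx
  local -a tokens
  # The '|| ((${#tokens[@]}))' part also processes a last line that lacks a trailing newline.
  while read -ra tokens || ((${#tokens[@]})); do
    for idx in "${!tokens[@]}"; do
      if [[ ${tokens[idx]} == "$pat" ]] && ((idx + n < ${#tokens[@]})); then
        printf '%s\n' "${tokens[idx + n]}"
      fi
    done
  done
}

# Prints PRINT_THIS once; the second line has too few words after PATTERN.
printf '%s\n' 'ads PATTERN ads da PRINT_THIS' 'x y PATTERN z' |
  print_nth_after PATTERN 3
```

The right-hand side of == is quoted, so pat is compared as a literal string and not as a glob.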

Andrej Podzimek