Awk to print a single next word following a pattern match

Question

This Q is a variation on the theme of printing something after a pattern.

The input consists of lines of words. Some lines will match a pattern made up of one or more words separated by spaces. The pattern might have a leading or trailing space, which must be honored. I need to print the word immediately following the match.

Example input

The quick brown fox jumps over the lazy dog

Pattern : "brown fox "
Desired output : jumps

The pattern will only occur once in the line. There will always be a word following the pattern. There will be lines without the pattern.

awk or sed would be nice.

Cheers.

EDIT :

I failed to ask the question properly. There will be one or more spaces between the pattern and the next word. This breaks Andre's proposal.

    % echo -e "The quick brown fox jumps over the lazy dog\n" | awk -F 'brown fox ' 'NF>1{ sub(/ .*/,"",$NF); print $NF }'
    jumps
    % echo -e "The quick brown fox    jumps over the lazy dog\n" | awk -F 'brown fox ' 'NF>1{ sub(/ .*/,"",$NF); print $NF }'

Please read https://stackoverflow.com/questions/65621325/how-do-i-find-the-text-that-matches-a-pattern to understand why it matters and then replace the word "pattern" with "regexp" or "string" everywhere it occurs in your question and add the information on how partial matches should be handled (e.g. does `own fox` match `brown fox`?). Once you do that we can help you come up with the right solution for your problem. — Ed Morton, Jan 20 '21 at 04:25

Andre Wildberg · Answer 1 · 2021-01-20T03:01:28.200

1

This works, given that the desired word is followed by a space:

$ echo -e "The quick brown fox jumps over the lazy dog\n" > file

$ awk -F 'brown fox ' 'NF>1{ sub(/ .*/,"",$NF); print $NF }' file
jumps

Edit: If there can be more than one space after the pattern, use this:

$ awk -F 'brown fox' 'NF>1{ sub(/^ */,"",$NF);
                            sub(/ .*/,"",$NF); print $NF }' file

edited Jan 20 '21 at 03:01

answered Jan 20 '21 at 02:29

Andre Wildberg


dawg · Answer 2 · 2021-01-21T13:39:02.083

1

With GNU grep:

$ grep -oP '(?<=brown fox )(\w+)' file
jumps

If you have more than 1 space after the match:

$ echo 'The quick brown fox     jumps over the lazy dog' | grep -oP '(?<=\bbrown fox\b)\s+\K(\w+)'
jumps

Perl, with the same regex:

$ perl -lnE 'print $1 if /(?<=\bbrown fox )(\w+)/' file

Or, if you have multiple spaces:

$ perl -lnE 'print $1 if /(?<=brown fox)\s+(\w+)/' file

(As stated in comments, both the GNU grep and Perl regex could be \bbrown\h+fox\h+\K\w+ which has the advantage of supporting multiple spaces between brown and fox)

With awk, you can split the line on the string and then split the remainder into words (this works as-is with multiple spaces):

pat='brown fox'
awk -v pat="$pat" 'index($0, pat){
            split($0,arr, pat)
            split(arr[2], arr2)
            print arr2[1]}' file
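
One caveat with the snippet above: index() searches for a literal string, while split() treats its third argument as a regular expression, so the two can disagree when pat contains characters such as `.` or `*`. A minimal sketch of a purely literal variant, assuming a plain substring search is what is wanted; the function name next_word_literal is hypothetical and not part of the original answer. Note that awk -v still interprets backslash escapes in the assigned value.

```shell
# next_word_literal STRING [FILE...]
# Print the first word that follows a literal STRING (hypothetical helper).
next_word_literal() {
    str=$1; shift
    awk -v str="$str" '
        # index() is a plain substring search: regex metacharacters in str are not special
        (pos = index($0, str)) {
            rest = substr($0, pos + length(str))   # everything after the matched string
            # default split() separates on runs of blanks and ignores leading blanks
            if (split(rest, words) > 0) print words[1]
        }' "$@"
}

echo 'The quick brown fox    jumps over the lazy dog' | next_word_literal 'brown fox '
```

Lines that do not contain the string produce no output at all.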

edited Jan 21 '21 at 13:39

answered Jan 20 '21 at 02:47

dawg


Unfortunately with multiple spaces between pattern and capture word this is broken. – Gert Gottschalk Jan 20 '21 at 02:57
Easy fix. Edit made. – dawg Jan 20 '21 at 02:58
I only tested the grep version. It is OK. – Gert Gottschalk Jan 20 '21 at 03:50

(The perl compatible pattern for grep and perl can be `\bbrown\h+fox\h+\K\w+` without the lookbehind) – The fourth bird Jan 21 '21 at 09:33

score 1 · Answer 3 · answered Jan 20 '21 at 10:11

Disclaimer: this solution assumes that when the pattern is not found (the question says "There will be lines without the pattern") it is acceptable to print an empty line. If that does not hold, ignore this answer entirely.

I would use AWK for this in the following way. Let the content of file.txt be

The quick brown fox jumps over the lazy dog
No animals in this line
The quick brown fox   jumps over the lazy dog

then

awk 'BEGIN{FS="brown fox  *"}{sub(/ .*/,"",$2);print $2}' file.txt

output

jumps

jumps

Explanation: I set the field separator FS to "brown fox " followed by any number of spaces. Whatever comes after it ends up in the 2nd field. I strip from the 2nd field everything from the first space onward, including that space, then print the field. If there is no match, the 2nd field is empty and these actions result in an empty line.
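
If the empty line for non-matching lines is not wanted, the same FS approach can be restricted to lines where the separator actually occurred. A sketch under that assumption, reusing the NF>1 guard from Andre Wildberg's answer; the function name next_word_or_skip is hypothetical.

```shell
# next_word_or_skip [FILE...]
# Same field-separator trick, but NF > 1 is only true when FS matched,
# so lines without "brown fox " are skipped instead of printing an empty line.
next_word_or_skip() {
    awk 'BEGIN { FS = "brown fox  *" }
         NF > 1 { sub(/ .*/, "", $2); print $2 }' "$@"
}

printf '%s\n' 'The quick brown fox jumps over the lazy dog' \
              'No animals in this line' \
              'The quick brown fox   jumps over the lazy dog' | next_word_or_skip
```

With the three sample lines this prints jumps twice and nothing for the middle line.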

score 0 · Answer 4 · answered Jan 21 '21 at 09:57

With GNU awk, you can also use a capture group with the match function.

\ybrown\s+fox\s+(\w+)

\y A word boundary
brown\s+ Match brown and 1+ whitespace chars
fox\s+ Match fox and 1+ whitespace chars
(\w+) Capture 1+ word chars in group 1

In awk, get the group 1 value using arr[1]

Example

echo "The quick brown fox jumps over the lazy dog" |
awk 'match($0,/\ybrown\s+fox\s+(\w+)/, arr) {print arr[1]}'

Output

jumps
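
The third argument to match, as well as \y, \s and \w, are GNU awk extensions. A rough portable sketch, assuming only POSIX awk is available: it uses RSTART and RLENGTH to find the end of the match and then takes the first whitespace-delimited word. It is not an exact equivalent, since there is no word boundary before brown and the word is delimited by blanks rather than by \w+. The function name next_word_posix is hypothetical.

```shell
# next_word_posix [FILE...]
# POSIX awk: match() sets RSTART (start of match) and RLENGTH (its length).
next_word_posix() {
    awk 'match($0, /brown +fox +/) {
        rest = substr($0, RSTART + RLENGTH)          # text right after the match
        if (split(rest, words) > 0) print words[1]   # first blank-delimited word
    }' "$@"
}

echo "The quick brown   fox  jumps over the lazy dog" | next_word_posix
```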

