grep - show two parts of matching line

Question

I've read through the grep man page and tried a few things, but none of them worked, at least not for me.

I want to extract a nicely readable line while tailing a log. This is a generic line in a log file I want to beautify:

26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in

What I want is this:

26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in

I know that with grep -o -e ".*\[TEXT\]" I get the first part, and with grep -o -e "\[OTHERTEXT\].*", I get the second part.

But the two parts are not displayed on one line, not even when I combine the patterns into grep -o -e ".*\[TEXT\]" -e "\[OTHERTEXT\].*"

[TEXT] and [OTHERTEXT] always are there and are my 'separators', so can be used to support extracting the parts I need.

I initially thought I could use grep -o -e "(.*\[TEXT\]).*(\[OTHERTEXT\].*)" and then somehow use the matching groups $1 and $2, but either I don't see it or there is no way to do so.

Is there a way to achieve what I want?

I would prefer grep (simply because I want to learn more about it), but if that is not possible then awk or sed are fine as well; it just has to be usable with tail -f.

And I'm also open to other approaches to get to that point, so let me know what ways exist to get there.

Thanks, Tobias

score 4 · Accepted Answer · answered Jan 26 '18 at 09:12

You can use sed:

sed -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/' file.log

26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in

This sed command matches everything from [TEXT] through [OTHERTEXT] and captures the two markers in 2 groups. The replacement puts those markers back using the back-references \1 and \2, separated by a single space, so the text between them is dropped.
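A small sketch of how the substitution behaves on lines that lack the markers, in case the log also contains other kinds of lines. The second input line is made up for illustration; `-n` together with the `p` flag is standard sed and prints only the lines where the substitution succeeded.

```shell
#!/bin/sh
# Sample input: the log line from the question plus a made-up line without the markers.
sample='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in
a hypothetical line without any marker'

# Default behaviour: lines where the substitution does not apply pass through unchanged.
all_lines=$(printf '%s\n' "$sample" | sed -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/')

# With -n and the p flag, only lines where the substitution succeeded are printed.
only_matches=$(printf '%s\n' "$sample" | sed -nE 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/p')

printf '%s\n' "$all_lines"
printf '%s\n' '---'
printf '%s\n' "$only_matches"
```

Which variant fits depends on whether the unmatched lines should stay visible while tailing.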

answered Jan 26 '18 at 09:12

anubhava


That works great, thanks! I just had to do minor changes: `sed -E "s/(.*\[TEXT\]).*(\[OTHERTEXT\].*)/\1 \2/"` – ximarin Jan 26 '18 at 09:56
Matching the text before and after using `.*` is not really needed. Just `sed -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/'` should work fine as well. – anubhava Jan 26 '18 at 10:03
Based on James Brown's answer, a shorter sed for this specific use case: `sed -E "s/\].*\[/] [/"` – ximarin Jan 26 '18 at 10:06
Actually `sed -E "s/\].*\[/] [/"` might work for this line, but if there is another `[...]` before `[TEXT]` or after `[OTHERTEXT]` then it will fail. – anubhava Jan 26 '18 at 10:07
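To illustrate the failure mode from the last comment, here is a sketch with a made-up log line that carries an extra `[docs]` bracket after `[OTHERTEXT]`. The generic substitution is written with `] [` as the replacement (as in James Brown's awk answer) so that the brackets themselves survive; the line content is hypothetical.

```shell
#!/bin/sh
# Hypothetical log line with one more [...] after [OTHERTEXT].
line='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) some.logger: [OTHERTEXT] see [docs] here'

# Generic version: ".*" is greedy, so it runs up to the LAST "[" on the line
# and [OTHERTEXT] is swallowed together with the text in between.
generic=$(printf '%s\n' "$line" | sed -E 's/\].*\[/] [/')

# Marker-anchored version: only the text between the two named markers is dropped.
anchored=$(printf '%s\n' "$line" | sed -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/')

printf '%s\n' "$generic"    # 26 Jan 2018 08:32:29,309 [TEXT] [docs] here
printf '%s\n' "$anchored"   # 26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] see [docs] here
```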

score 1 · Answer 2 · answered Jan 26 '18 at 09:56

Using awk you could replace everything between ] and [ with ] [:

$ awk 'sub(/\].*\[/,"] [")' file
26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in
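One detail worth knowing about this form: `sub()` is used as the awk pattern, so a line is printed only when a substitution actually happened. The sketch below contrasts that with a variant that keeps every line; the second input line is made up for illustration.

```shell
#!/bin/sh
# Sample input: the log line from the question plus a made-up line without brackets.
sample='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in
a hypothetical line without brackets'

# sub() as a bare pattern: it returns the number of substitutions (0 or 1),
# so lines without a match are silently dropped.
filtered=$(printf '%s\n' "$sample" | awk 'sub(/\].*\[/, "] [")')

# sub() inside an action, followed by the always-true pattern 1: every line is printed.
kept=$(printf '%s\n' "$sample" | awk '{ sub(/\].*\[/, "] [") } 1')

printf '%s\n' "$filtered"
printf '%s\n' '---'
printf '%s\n' "$kept"
```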

answered Jan 26 '18 at 09:56

James Brown


score 0 · Answer 3 · edited Jan 26 '18 at 09:12

Pipe the output of your grep into sed:

<your grep> | sed "s/(myService-0).*[OTHERTEXT]/(myService-0)[OTHERTEXT]/"

edited Jan 26 '18 at 09:12

melwil


answered Jan 26 '18 at 09:11

developer


`[OTHERTEXT]` is a bracket expression: it matches any single one of the characters listed inside `[...]`, not the literal text `[OTHERTEXT]`. – anubhava Jan 26 '18 at 09:20
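A quick sketch of what that comment means in practice, run against the log line from the question. The second command escapes the opening bracket so that it is matched literally (the closing `]` needs no escape outside a bracket expression); that escaped variant is a suggested correction, not part of the original answer.

```shell
#!/bin/sh
line='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'

# Unescaped: [OTHERTEXT] matches ONE character out of O, T, H, E, R, X.
# The greedy ".*" therefore runs up to the last such character, the "T" of "Text".
unescaped=$(printf '%s\n' "$line" | sed 's/(myService-0).*[OTHERTEXT]/(myService-0)[OTHERTEXT]/')

# Escaped: \[ is a literal bracket, so the literal marker [OTHERTEXT] is matched.
escaped=$(printf '%s\n' "$line" | sed 's/(myService-0).*\[OTHERTEXT]/(myService-0)[OTHERTEXT]/')

printf '%s\n' "$unescaped"  # ... (myService-0)[OTHERTEXT]ext im actually interested in
printf '%s\n' "$escaped"    # ... (myService-0)[OTHERTEXT] Text im actually interested in
```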

score 0 · Answer 4 · answered Jan 26 '18 at 09:42

You can do that with perl:

$ # note that this will print empty lines when no match is found
$ perl -lne 'print /(.*\[TEXT\] ).*(\[OTHERTEXT\].*)/' ip.txt
26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in
$ # you can avoid empty lines by checking for match first
$ perl -lne '/(.*\[TEXT\] ).*(\[OTHERTEXT\].*)/ && print $1,$2' ip.txt
26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in

Since you are processing tail -f output, you might need buffering control; see "How to 'grep' a continuous stream?" for an example.
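As a rough sketch of what buffering control can look like with the sed approach from this thread: GNU sed has `-u` (`--unbuffered`), and GNU grep has `--line-buffered` for a grep stage in the same pipeline. Both are assumed to be the GNU versions here; `beautify` and the log path are made-up names for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reformat log lines from stdin and flush after every line.
# -u is the GNU sed option for unbuffered output (assumption: GNU sed is installed).
beautify() {
    sed -u -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/'
}

# Intended use while tailing (the path is a placeholder):
#   tail -f /path/to/your.log | beautify
# With a grep stage in between, GNU grep can be asked to flush per line:
#   tail -f /path/to/your.log | grep --line-buffered 'TEXT' | beautify

# Quick check on the single line from the question:
line='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'
out=$(printf '%s\n' "$line" | beautify)
printf '%s\n' "$out"
```

When sed writes straight to a terminal the difference is usually invisible; it matters mostly once the output is piped into another command or a file.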

score 0 · Answer 5 · answered Jan 26 '18 at 09:53

You probably need sed for doing what you want:

sed -E 's/(.*\[TEXT]).*(\[OTHERTEXT])/\1 \2/'

But to answer your question about how to show matches in grep: yes, it is possible with the -o option, which shows only the matched parts of a matching line. Nevertheless, if you use

grep -o -e ".*\[TEXT\]" -e "\[OTHERTEXT\].*"

you will get your matched parts, but on separate lines.

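Another possibility could be to use look-ahead and look-behind expressions, but that cannot work in your case: each match that -o prints is still one contiguous piece of the line, so the unwanted middle part cannot be skipped within a single match.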
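If you want to stay with grep anyway, one possible workaround (a sketch, not something grep does on its own) is to glue the two -o output lines back together with paste. This assumes every matching log line yields exactly two matches; a line with only one of the markers would shift the pairing.

```shell
#!/bin/sh
line='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'

# grep -o prints each matched part on its own line;
# "paste -d ' ' - -" reads two lines at a time and joins them with a space.
# The closing "]" is left unescaped because it is not special outside a bracket expression.
joined=$(printf '%s\n' "$line" \
    | grep -o -e '.*\[TEXT]' -e '\[OTHERTEXT].*' \
    | paste -d ' ' - -)

printf '%s\n' "$joined"
```

For tail -f the grep stage would probably also need line buffering (GNU grep: `--line-buffered`), since its output goes into a pipe.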
