3

I've read through the grep man page and tried a few things, but none of them worked, at least not for me.

I want to extract a nicely readable line while tailing a log. This is a typical line from the log file that I want to beautify:

26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in

What I want is this:

26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in

I know that grep -o -e ".*\[TEXT\]" gives me the first part, and grep -o -e "\[OTHERTEXT\].*" gives me the second part.

But the two parts are not displayed on one line, not even when I combine them into grep -o -e ".*\[TEXT\]" -e "\[OTHERTEXT\].*"

[TEXT] and [OTHERTEXT] are always present and act as my 'separators', so they can be used to extract the parts I need.

I initially thought I could use grep -o -e "(.*\[TEXT\]).*(\[OTHERTEXT\].*)" and then somehow use the matching groups $1 and $2, but either I don't see it or there is no way to do so.

Is there a way to achieve what I want?

I would prefer grep (simply because I want to learn more about it), but if that is not possible then awk or sed are fine as well; it just has to be usable with tail -f.

And I'm also open to other approaches to get to that point, so let me know what ways exist to get there.

Thanks, Tobias

ximarin

5 Answers

4

You can use sed:

sed -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/' file.log

26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in

This sed command matches everything from [TEXT] through [OTHERTEXT] and captures the two markers in 2 groups. The replacement puts those markers back using the back-references \1 and \2, separated by a space.
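Since the question asks for something usable with tail -f, here is a minimal sketch of the same substitution wrapped for use on a stream. The function name `beautify` and the log path in the comment are hypothetical, not from the original post.

```shell
# Hypothetical helper: keep the line up to [TEXT] and from [OTHERTEXT] on,
# dropping whatever sits between the two markers.
beautify() {
  sed -E 's/(\[TEXT\]).*(\[OTHERTEXT\])/\1 \2/'
}

# Try it on the sample line from the question.
sample='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'
printf '%s\n' "$sample" | beautify

# On a live log (the path is an assumption):
#   tail -f /path/to/app.log | beautify
```

When sed writes straight to a terminal, output normally appears line by line. If the result is piped into yet another command, GNU sed's -u (unbuffered) option may be needed to avoid delays.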

anubhava
  • That works great, thanks! I just had to make minor changes: `sed -E "s/(.*\[TEXT\]).*(\[OTHERTEXT\].*)/\1 \2/"` – ximarin Jan 26 '18 at 09:56
  • Matching the text before and after using `.*` is not really needed. Just `sed -E 's/(\[TEXT]).*(\[OTHERTEXT])/\1 \2/' ` should work fine as well. – anubhava Jan 26 '18 at 10:03
  • Based on James Brown's answer, a shorter sed for this specific use case: `sed -E "s/\].*\[//"` – ximarin Jan 26 '18 at 10:06
  • Actually `sed -E "s/\].*\[//"` might work for this line but if there is another `[...]` before `[TEXT]` or after `[OTHERTEXT]` then it will fail. – anubhava Jan 26 '18 at 10:07
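The difference between the generic and the anchored pattern can be seen on a line with an extra bracket pair. This is only an illustrative sketch: the sample line is made up, and the generic variant here uses the `] [` replacement (as in the awk answer) so the two results are directly comparable.

```shell
# Made-up sample line with an extra [...] after [OTHERTEXT].
line='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) a.b.c: [OTHERTEXT] failed with [code 7] today'

# Generic version: spans from the first "]" to the LAST "[" on the line,
# because .* is greedy.
generic=$(printf '%s\n' "$line" | sed -E 's/\].*\[/] [/')

# Anchored version: only the span between the two known markers.
anchored=$(printf '%s\n' "$line" | sed -E 's/(\[TEXT\]).*(\[OTHERTEXT\])/\1 \2/')

printf 'generic:  %s\nanchored: %s\n' "$generic" "$anchored"
```

The generic pattern swallows `[OTHERTEXT]` together with part of the message, while the anchored one leaves everything after `[OTHERTEXT]` intact.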
1

Using awk you could replace everything between ] and [ with ] [:

$ awk 'sub(/\].*\[/,"] [")' file
26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in
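Note that `awk 'sub(...)'` prints only the lines on which a substitution happened, because the return value of sub() is used as the pattern. A possible variant for use behind tail -f, sketched under the assumption that every line should be printed and that the markers are known, could look like this. The function name `beautify_awk` is hypothetical.

```shell
# Hypothetical helper: rewrite the span between the markers, print every line,
# and flush after each one so output is not held back when stdout is a pipe.
beautify_awk() {
  awk '{ sub(/\[TEXT\].*\[OTHERTEXT\]/, "[TEXT] [OTHERTEXT]"); print; fflush() }'
}

sample='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'
printf '%s\n' "$sample" | beautify_awk

# On a live log (the path is an assumption):
#   tail -f /path/to/app.log | beautify_awk
```

Anchoring on the marker names instead of bare `]` and `[` avoids surprises when a line contains other brackets.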
James Brown
0

Pipe your grep output into:

<your grep> | sed "s/(myService-0).*[OTHERTEXT]/(myService-0)[OTHERTEXT]/"
melwil
developer
  • Unescaped, `[OTHERTEXT]` is a bracket expression: it matches any single one of the characters inside `[...]`, not the literal text. – anubhava Jan 26 '18 at 09:20
0

You can do that with perl:

$ # note that this will print empty lines when no match is found
$ perl -lne 'print /(.*\[TEXT\] ).*(\[OTHERTEXT\].*)/' ip.txt
26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in
$ # you can avoid empty lines by checking for match first
$ perl -lne '/(.*\[TEXT\] ).*(\[OTHERTEXT\].*)/ && print $1,$2' ip.txt
26 Jan 2018 08:32:29,309 [TEXT] [OTHERTEXT] Text im actually interested in

Since you are processing tail -f output, you might need to control buffering; see "How to 'grep' a continuous stream?" for an example.
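As a rough sketch of what buffering control can look like in a pipeline: when grep sits in the middle of a pipe, its --line-buffered option makes it pass each line on immediately. The function name `filter_and_beautify` is hypothetical, and sed is used for the substitution here instead of perl.

```shell
# Hypothetical pipeline stage: keep only lines containing the marker, then
# strip the middle part. --line-buffered makes grep write each line at once
# instead of waiting for its output buffer to fill.
filter_and_beautify() {
  grep --line-buffered -F '[OTHERTEXT]' |
    sed -E 's/(\[TEXT\]).*(\[OTHERTEXT\])/\1 \2/'
}

sample='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'
printf '%s\n' 'some unrelated line' "$sample" | filter_and_beautify

# On a live log (the path is an assumption):
#   tail -f /path/to/app.log | filter_and_beautify
```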

Sundeep
0

You probably need sed for doing what you want:

sed -E 's/(.*\[TEXT]).*(\[OTHERTEXT])/\1 \2/' 

But to answer your question about showing only the matches in grep: yes, that is possible with the -o option, which prints only the matched parts of each matching line. However, if you use

grep -o -e ".*\[TEXT\]" -e "\[OTHERTEXT\].*"

you will get your matched parts, but on separate lines.
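If staying with grep matters, one possible workaround is to glue the two output lines back together with paste. This is only a sketch, and it assumes that every input line yields exactly two matches (one per marker); a line containing only one of the markers would shift the pairing. The function name `join_grep_parts` is hypothetical.

```shell
# Hypothetical helper: grep -o prints the two matched parts on separate lines;
# paste then joins each pair of consecutive lines with a single space.
join_grep_parts() {
  grep -o -e '.*\[TEXT]' -e '\[OTHERTEXT].*' | paste -d ' ' - -
}

sample='26 Jan 2018 08:32:29,309 [TEXT] (myService-0) long.text.I.dont.care.about.but.is.different.in.every.line: [OTHERTEXT] Text im actually interested in'
printf '%s\n' "$sample" | join_grep_parts
```

Behind tail -f, grep would likely also need --line-buffered here, since its output goes into a pipe.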

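Another possibility would be look-ahead and look-behind expressions (grep -P), but they do not help in your case: each match that -o prints is still one contiguous piece of the line, so a single match cannot skip the middle part.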

rools