Extracting key word from a log line

Question

I have a log that looks like this:

.....client connection.....remote=/xxx.xxx.xxx.xxx]].......

I need to extract every line in the log that contains the above, and print just the IP after remote=. I imagine it would be something along these lines:

grep "client connection" xxx.log | sed -e ....

What have you tried? Most of us here are happy to help you improve your craft, but are less happy acting as short order unpaid programming staff. Show us your work so far in an [MCVE](http://stackoverflow.com/help/mcve), the result you were expecting and the results you got, and we'll help you figure it out. — ghoti, Dec 27 '16 at 05:52
Possible duplicate of [Extract pattern from a string](http://stackoverflow.com/questions/11533063/extract-pattern-from-a-string) — tripleee, Dec 27 '16 at 06:17
Users with a rep of approaching 2K should know by now not to [ask volunteers for urgency](http://meta.stackoverflow.com/q/326569/472495). — halfer, Dec 27 '16 at 09:12

score 1 · Answer 1 · answered Dec 27 '16 at 05:40

Using grep:

grep -oP '(?<=remote=/)[^\]]+' file

-o prints only the matched part instead of the entire line. -P enables Perl-compatible regular expressions. Here we use a positive lookbehind, (?<=remote=/): the match is the run of characters other than ] that is preceded by remote=/.

Guru

A more robust way would be to include the pattern `client connection` into `grep` as that is what OP needs. Your logic might also include lines that do not have `client connection` in the same line. – Inian Dec 27 '16 at 05:45
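Following the comment above, here is a minimal sketch that keeps the lookbehind but first filters on client connection. It assumes GNU grep built with PCRE support (-P); the function name extract_client_ips is hypothetical and only there for illustration.

```shell
# Hypothetical helper: print the text after "remote=/" (up to the first "]")
# only for lines that also contain "client connection".
# Assumption: GNU grep with -P (PCRE) support.
extract_client_ips() {
    grep 'client connection' "$1" | grep -oP '(?<=remote=/)[^\]]+'
}
```

Called as extract_client_ips xxx.log, it should skip any line that has remote=/ but no client connection.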

score 0 · Answer 2 · answered Dec 27 '16 at 05:23

Try this:

grep 'client connection' test.txt | awk -F'[/\\]]' '{print $2}'

Test case

test.txt
---------
abcd
.....client connection.....remote=/10.20.30.40]].......
abcs
.....client connection.....remote=/11.20.30.40]].......
.....client connection.....remote=/12.20.30.40]].......

Result

10.20.30.40
11.20.30.40
12.20.30.40

Explanation

grep narrows the input down to lines matching client connection. awk's -F flag sets the delimiter used to split each line into fields. We ask awk to split on / and ]. To use more than one delimiter, we place the delimiters between [ and ]. For example, to split text on = and :, we'd use [=:].

However, in our case one of the delimiters is ] itself, since my intent is to extract the IP from /x.x.x.x] by splitting the text on / and ]. So we escape it as \\]. The IP is the 2nd field after splitting.
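To see why the IP ends up in the 2nd field, this small sketch prints every field awk produces for one of the sample lines. The function name show_fields is made up for illustration. It uses -F'[]/]', which should be an equivalent way to write the same delimiter set: a ] placed first inside the brackets is taken literally, so no escaping is needed.

```shell
# Hypothetical helper: show how awk splits a line on "/" and "]".
# "[]/]" is a bracket expression containing "]" and "/".
show_fields() {
    printf '%s\n' "$1" |
        awk -F'[]/]' '{ for (i = 1; i <= NF; i++) printf "$%d=<%s>\n", i, $i }'
}

show_fields '.....client connection.....remote=/10.20.30.40]].......'
```

Field 1 is everything before the /, field 2 is the IP, and the doubled ]] leaves an empty field before the trailing text.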

score 0 · Answer 3 · edited May 23 '17 at 12:00

A more robust approach, improving on the earlier grep -oP answer, is to again use GNU grep in PCRE mode (-P, Perl-style regex), but to match both patterns mentioned in the question.

grep -oP "client connection.*remote=/\K(\d{1,3}\.){3}\d{1,3}" file
10.20.30.40
11.20.30.40
12.20.30.40

Here, client connection.*remote=/ requires both patterns to be present on the line before the IP is extracted. \K is PCRE syntax that discards everything matched up to that point, so only the part of the match that follows it is printed.

(\d{1,3}\.){3}\d{1,3}

This matches the IP: three groups of 1 to 3 digits, each followed by a dot, and then a fourth group of 1 to 3 digits.
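Where grep has no -P option (BSD grep, for example), roughly the same extraction can be sketched with extended regular expressions. This is a swapped-in variant, not the command from the answer: ERE has neither \d nor \K, so it matches remote=/ together with the IP and strips the prefix afterwards. The function name extract_ips_ere is hypothetical, and -o is assumed to be available (it is in GNU and BSD grep, but it is not required by POSIX).

```shell
# Hypothetical helper: ERE-only variant of the PCRE command.
# Unlike the PCRE pattern, it does not require "client connection" to
# appear before "remote=/" on the line, and it does not check that
# each octet is <= 255.
extract_ips_ere() {
    grep 'client connection' "$1" |
        grep -oE 'remote=/([0-9]{1,3}\.){3}[0-9]{1,3}' |
        sed 's_^remote=/__'
}
```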

P.... · Accepted Answer · 2016-12-27T06:37:50.133

grep -oP 'client connection.*remote=/\K.*?(?=])' input

This prints whatever lies between remote=/ and the closest ] on the lines that contain client connection.

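Or by using sed back-references. Here the line is divided into three parts (groups), which can later be referred to as \1, \2 and \3. Each group is enclosed in ( and ). The IP address is the 2nd group, so the whole line is replaced by the 2nd group. Note that sed's extended regex has no lazy .*?, so the 2nd group uses [^]]* to stop at the first ]. The -n option together with the p flag prints only the lines where the substitution happened; without them, lines that lack client connection would be printed unchanged.

sed -rn '/client connection/ s_(^.*remote=/)([^]]*)(\]\].*)_\2_p' input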

Or using awk:

awk -F'/|]]' '/client connection/{print $2}' input
