
I am trying to grab a specific sub-string within a log message:

Example:

esx03.mrlab.local dfwpktlogs: 61283 INET match PASS domain-c7/1001 IN 52 TCP 192.168.50.124/60313->192.168.50.122/48002 SEW

What I am looking to capture is the string after the 10th whitespace and before the next / mark.

In the above example, I am trying to capture 192.168.50.124

This string may or may not have an IP address, but it will always follow the 10th space and precede a slash (/).

I have tried a few methods, but I cannot seem to figure out how to begin after the 10th whitespace.

Wai Ha Lee
Buckwheattb
    What is the programming language/tool? What were the "few methods" you tried? Please add a valid tag to the question and update it with your attempts. – Wiktor Stribiżew Apr 16 '20 at 16:04

1 Answer


I think the following should work for you:

^([^\s]+\s){10}([^\/]+)\/.*$

This captures 192.168.50.124 in the second capture group.
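As a quick check, here is a minimal Python sketch that applies the pattern to the sample line from the question. Python is only an assumption here, since the question does not name a tool; the names `PATTERN`, `LOG_LINE`, and `source` are made up for illustration.

```python
import re

# Pattern from the answer: ten "token + whitespace" runs,
# then capture everything up to the next forward slash.
PATTERN = re.compile(r"^([^\s]+\s){10}([^\/]+)\/.*$")

# Sample log line copied from the question.
LOG_LINE = (
    "esx03.mrlab.local dfwpktlogs: 61283 INET match PASS domain-c7/1001 "
    "IN 52 TCP 192.168.50.124/60313->192.168.50.122/48002 SEW"
)

match = PATTERN.match(LOG_LINE)

# Group 1 only remembers the last repetition of the repeated group ("TCP ");
# group 2 holds the field of interest.
source = match.group(2) if match else None
print(source)
```

Note that group 1 is a repeated group, so it keeps only its final iteration; the wanted value is always in group 2.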

Here's a link to a RegExr demo.

Explanation: breaking down the regex, term by term:

  • ^ matches the start of the line.
  • [^\s]+\s - [^\s] is a negated character class; it matches any character other than a whitespace character (\s). [^\s]+\s therefore matches one or more non-whitespace characters followed by a single whitespace character.
  • ([^\s]+\s){10} - this matches the pattern inside the parentheses exactly 10 times: one or more non-whitespace characters followed by a whitespace character, ten times over. This consumes everything up to and including the 10th whitespace.
  • ([^\/]+) - this matches one or more characters that are not a forward slash. The forward slash is escaped (\/) because flavors that delimit patterns with /, such as JavaScript regex literals, can treat an unescaped / as the end of the pattern; in other flavors the escape is harmless. Enclosing this in parentheses allows us to access it later as a capture group.
  • \/.* - this matches a forward slash followed by zero or more of any other character.
  • $ matches the end of the line.
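The same breakdown can be written inline with the pattern in flavors that support a free-spacing mode. The sketch below assumes Python's `re.VERBOSE`; the name `FIELD_AFTER_TENTH_SPACE` and the input line are hypothetical. The input is not a real log entry, it is just ten single-letter tokens followed by a non-IP value, to show that the pattern depends only on position.

```python
import re

# Free-spacing version of the answer's pattern, one term per line.
FIELD_AFTER_TENTH_SPACE = re.compile(
    r"""
    ^               # start of the line
    ([^\s]+\s){10}  # ten runs of non-whitespace, each followed by one whitespace
    ([^/]+)         # group 2: everything up to the next forward slash
    /.*             # the slash itself, then the rest of the line
    $               # end of the line
    """,
    re.VERBOSE,
)

# Hypothetical input: ten tokens, then a non-IP value before the slash.
line = "a b c d e f g h i j hostname/123 rest"

match = FIELD_AFTER_TENTH_SPACE.match(line)
result = match.group(2) if match else None
print(result)
```

The slash is left unescaped here because Python patterns have no delimiter; whether that is acceptable depends on the flavor actually in use.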
simon-pearson
  • OP's questions are mostly about MS SQL that does not support regex based extraction like this. – Wiktor Stribiżew Apr 16 '20 at 16:16
  • Where in his question does he mention MSSQL? – simon-pearson Apr 16 '20 at 16:17
  • Same as nowhere is it mentioned OP can use a JavaScript/PCRE regex syntax. – Wiktor Stribiżew Apr 16 '20 at 16:23
  • it is actually for a tool we use to extract fields...the system uses regular expressions, so no MSSQL... – Buckwheattb Apr 16 '20 at 16:31
  • actually, I am having difficulty with the application understanding the capture group...is there a regex expression that will return the needed string without it being grouped? – Buckwheattb Apr 16 '20 at 16:43
  • Impossible I'm afraid @Buckwheattb, the _only_ thing that we know about the text that you want to capture is its position in the text as a whole ("it will always follow the 10th space and precede a slash (/)"). Sorry. – simon-pearson Apr 16 '20 at 16:46
  • I found this alternative: ^(?:[^\s]+\s){10}([^\/]+)\/.*$ Adding the ?: to the first group negates it as a return group...that settled it down and grabbed the right value. – Buckwheattb Apr 16 '20 at 16:48
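To illustrate the effect of the `?:` mentioned in the last comment, here is a small sketch, again assuming Python as the flavor (the extraction tool in question may behave differently). With the first group made non-capturing, the pattern exposes exactly one group, so a tool that reads "the first capture group" gets the wanted value. `NON_CAPTURING`, `LOG_LINE`, and `groups` are illustrative names.

```python
import re

# Variant from the comment: (?:...) groups without capturing,
# so the only numbered group is the field after the 10th whitespace.
NON_CAPTURING = re.compile(r"^(?:[^\s]+\s){10}([^\/]+)\/.*$")

# Sample log line copied from the question.
LOG_LINE = (
    "esx03.mrlab.local dfwpktlogs: 61283 INET match PASS domain-c7/1001 "
    "IN 52 TCP 192.168.50.124/60313->192.168.50.122/48002 SEW"
)

match = NON_CAPTURING.match(LOG_LINE)

# groups() lists every capture group; there is only one now.
groups = match.groups() if match else ()
print(groups)
```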