EDIT - I have resolved this with a workaround, but I will try your suggestions as well to see which approach I like better. I went through the "Extract new fields" process in Splunk, manually highlighted the data I want, then copied the auto-generated regex and used it directly.

I am having trouble matching patterns in a Splunk search with the rex command and passing the extracted fields to the |table command. regex101.com shows that the patterns match correctly under both PCRE flavors. I am trying to match values within double quotes. The double quotes are escaped in the Splunk query with backslashes.

Source message with wildcards: "Error in breakfast table *, table name \"*\". The quick brown fox jumped over the lazy dog. The maximum length of the \"*\" data is currently set to * hotdogs, but the bun length is * inches. Increase the maximum length of the \"*\" bun to at least * inches and retry.*"

Asterisks surrounded by double quotes stand for simple strings (one letter, two letters, multiple words, etc.), and standalone asterisks stand for numbers.

This works:

| rex "Error in breakfast table (?<breakfast_table>\d+)" | rename breakfast_table as "BT"

This does not work:

| rex "table name "(?<table_name>[^"]*)"" | rename table_name as "TN"

Rex statements 1 and 4 correctly display the numbers when I view them in a table. Rex statements 2 and 3 return NULL and display nothing, even though regex101 (and ChatGPT, for what it's worth) has no problem with the regex I am using.

| search Message="Error in breakfast table *, table name \"*\". The quick brown fox jumped over the lazy dog. The maximum length of the \"*\" data is currently set to * hotdogs, but the bun length is * inches. Increase the maximum length of the \"*\" bun to at least * inches and retry.*"

| rex "Error in breakfast table (?<breakfast_table>\d+)" | rename breakfast_table as "BT"

| rex "table name \"(?<table_name>[^\"]*)\"" | rename table_name as "TN"
| rex "maximum length of the \"(?<max_bunlength>[^\"]*)\"" | rename max_bunlength as "MB"

| rex "data is currently set to (?<current_length>\d+)" | rename current_length as "Current Length"

 

I have confirmed on regex101.com that the regex patterns I tested match what I expect. I have tried many different regex patterns directly in the Splunk query by trial and error, to no avail.

Match 1 will capture the entire substring, and Group table_name will correctly capture just the value I want.

For example, a real message may contain "email to" or "message id" in place of the wildcard. When I test the regex from this statement on regex101 -> | rex "maximum length of the "(?<max_bunlength>[^"]*)"" | rename max_bunlength as "MB" I correctly see the following matches:

Match 1: maximum length of the "email to"
Group max_bunlength: email to

Yet the max_bunlength field shows NULL for every record in my display table.
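
For anyone who wants to reproduce the regex101 check offline, here is a minimal Python sketch. The sample message and its values (7, "orders", "email to", 20, 35) are invented for illustration, and `extract_fields` is a hypothetical helper, not anything from Splunk. Note that Python's `re` module writes named groups as `(?P<name>...)` rather than the `(?<name>...)` form that rex and PCRE accept. It only confirms that the patterns themselves are sound when the text contains plain double quotes; it says nothing about how Splunk stores the event.

```python
import re

# Hypothetical sample message; every value here is invented for illustration.
message = (
    'Error in breakfast table 7, table name "orders". '
    'The quick brown fox jumped over the lazy dog. '
    'The maximum length of the "email to" data is currently set to 20 hotdogs, '
    'but the bun length is 35 inches. '
    'Increase the maximum length of the "email to" bun to at least 35 inches and retry.'
)

# The four patterns from the question, with Python's (?P<name>...) group syntax.
patterns = {
    "breakfast_table": r'Error in breakfast table (?P<breakfast_table>\d+)',
    "table_name": r'table name "(?P<table_name>[^"]*)"',
    "max_bunlength": r'maximum length of the "(?P<max_bunlength>[^"]*)"',
    "current_length": r'data is currently set to (?P<current_length>\d+)',
}


def extract_fields(text):
    """Apply each pattern and return the named group, or None when nothing matches."""
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        fields[name] = match.group(name) if match else None
    return fields


fields = extract_fields(message)
print(fields)
```

All four fields are populated here, which matches what regex101 reports.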

fewrgw5yu
  • The `max_bunlength` field is NULL because the field is renamed to `MB`. – RichG Apr 20 '23 at 21:16
  • Why extract a field just to immediately rename it? Instead of extracting `max_bunlength` and renaming to `MB`, just extract `MB`. – warren Apr 21 '23 at 14:25
  • @warren I renamed it for better readability when I use it in a |table statement at the very end of the query. – fewrgw5yu Apr 24 '23 at 13:54
  • @RichG I don't think so - other fields are renamed and populate data just fine. – fewrgw5yu Apr 24 '23 at 13:54
  • If the field is NULL when using the correct name then that means data was not extracted from the raw event into that field. Double-check the regular expression and any other commands used to populate the field. Consider sharing the full query. – RichG Apr 24 '23 at 14:04

1 Answer

These rex commands should work:

| rex field=Message "rror in \w+\s\w+\s(?<error>[^,]+)"
| rex field=Message ", table name\s[^\"]+(?<table_name>[^,]+?)\"\."
| rex field=Message "The maximum[^\"]+\"(?<max_bun_length>[^\"]+)"
| rex field=Message "data is currently set to (?<current_length>\S+)"
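
One possible reason `field=Message` matters (an assumption, since the raw events were never shared): rex runs against `_raw` unless a field is given, and if the raw event is JSON, the double quotes inside the message are stored there with a backslash in front. A pattern that expects `"` right after `table name ` would then fail on the raw text but match on the decoded field, while the digit-only patterns work either way. A minimal Python sketch of that situation, with an invented event and Python's `(?P<name>...)` group syntax:

```python
import json
import re

# Hypothetical raw event: assumed to be JSON with the text under a "Message" key.
raw_event = json.dumps(
    {"Message": 'Error in breakfast table 7, table name "orders".'}
)
# raw_event now contains: table name \"orders\"  (quotes escaped, as in raw JSON)

quoted_pattern = r'table name "(?P<table_name>[^"]*)"'
number_pattern = r'Error in breakfast table (?P<breakfast_table>\d+)'

# Against the raw text, the quoted pattern fails: a backslash precedes the quote.
raw_match = re.search(quoted_pattern, raw_event)

# Against the decoded field value, the same pattern matches.
message = json.loads(raw_event)["Message"]
field_match = re.search(quoted_pattern, message)

# The digit-only pattern matches the raw text regardless of quote escaping.
number_match = re.search(number_pattern, raw_event)

print(raw_match)                              # None
print(field_match.group("table_name"))        # orders
print(number_match.group("breakfast_table"))  # 7
```

If your events are not JSON, this explanation does not apply, and the expressions should be checked against an actual `_raw` value instead.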
warren