EDIT - I have resolved this with a workaround, but I will try your suggestions as well to see which one I like better. I went through the "Extract new fields" process in Splunk, manually highlighted the data I wanted, then copied the auto-generated regex and used it directly.
I am having trouble matching patterns in a Splunk search with the rex command and outputting them with the | table command. Regex101.com shows that the pattern matches correctly under both PCRE flavors. I am trying to match values within double quotes. The double quotes are escaped in the Splunk query with backslashes.
Source message with wildcards: "Error in breakfast table *, table name \"*\". The quick brown fox jumped over the lazy dog. The maximum length of the \"*\" data is currently set to * hotdogs, but the bun length is * inches. Increase the maximum length of the \"*\" bun to at least * inches and retry.*"
Asterisks surrounded by double quotes stand for simple strings (one letter, two letters, multiple words, etc.), and standalone asterisks stand for numbers.
This works:
| rex "Error in breakfast table (?<breakfast_table>\d+)" | rename breakfast_table as "BT"
This does not work:
| rex "table name "(?<table_name>[^"]*)"" | rename table_name as "TN"
rex statements 1 and 4 in the full query below correctly display the numbers when I view them in a table. rex statements 2 and 3 return NULL and display nothing, even though regex101 (and ChatGPT, for what it's worth) has no problem with the regex I am using.
| search Message="Error in breakfast table *, table name \"*\". The quick brown fox jumped over the lazy dog. The maximum length of the \"*\" data is currently set to * hotdogs, but the bun length is * inches. Increase the maximum length of the \"*\" bun to at least * inches and retry.*"
| rex "Error in breakfast table (?<breakfast_table>\d+)" | rename breakfast_table as "BT"
| rex "table name \"(?<table_name>[^\"]*)\"" | rename table_name as "TN"
| rex "maximum length of the \"(?<max_bunlength>[^\"]*)\"" | rename max_bunlength as "MB"
| rex "data is currently set to (?<current_length>\d+)" | rename current_length as "Current Length"
I have confirmed on regex101.com that the regex patterns match what I expect. I have tried many different regex patterns directly in the Splunk query by trial and error, to no avail.
On regex101, Match 1 captures the entire substring, and Group table_name correctly captures just the value I want.
For example, a real message may contain values such as "email to" or "message id" in place of the wildcard. When I test the regex used in this statement on regex101:
| rex "maximum length of the "(?<max_bunlength>[^"]*)"" | rename max_bunlength as "MB"
I correctly see the following matches:
Match 1: maximum length of the "email to"
Group max_bunlength: email to
Yet the max_bunlength field shows NULL for every record in my display table.
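To make the regex101 check reproducible, here is a minimal Python sketch of the same test, run outside Splunk. The sample values (12, orders, email to, 40, 55) are made up for illustration, and the helper name extract_fields is hypothetical. Python spells named groups as (?P<name>...) rather than (?<name>...), so the patterns are equivalent to the rex patterns but not character-for-character identical, and they are written without the Splunk string escaping. This only confirms that the patterns match a message containing plain double quotes; it says nothing about how Splunk stores or escapes the event.

```python
import re

# Sample message with made-up values substituted for the wildcards.
message = (
    'Error in breakfast table 12, table name "orders". '
    'The quick brown fox jumped over the lazy dog. '
    'The maximum length of the "email to" data is currently set to 40 hotdogs, '
    'but the bun length is 55 inches. '
    'Increase the maximum length of the "email to" bun to at least 55 inches and retry.'
)

# Same patterns as the four rex statements, without Splunk string escaping.
# Python requires (?P<name>...) for named groups.
patterns = {
    "breakfast_table": r'Error in breakfast table (?P<breakfast_table>\d+)',
    "table_name": r'table name "(?P<table_name>[^"]*)"',
    "max_bunlength": r'maximum length of the "(?P<max_bunlength>[^"]*)"',
    "current_length": r'data is currently set to (?P<current_length>\d+)',
}


def extract_fields(text):
    """Return the first capture of each named group, or None if no match."""
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        fields[name] = match.group(name) if match else None
    return fields


print(extract_fields(message))
```

With this sample message all four groups capture a value, which matches what I see on regex101 and suggests the difference lies in what Splunk is actually matching against rather than in the patterns themselves.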