We need to build a regex to extract some fields from our events in Splunk at index time. These fields will be used in searches with the tstats command. The regex will go in a Splunk configuration file (transforms.conf).

The key characteristic of the fields we want to extract at index time is that they share the same JSON key but sit under different parent JSON keys.

Is it possible to model this extraction with a regex?

This is an example of a Splunk event with the structure described above (the event is JSON):

{
   "info":{
      "eventSource":"",
      "sourceType":"I/O",
      "status":{
         "code":"",
         "msg":"",
         "msgError":""
      },
      "transactionId":null,
      "traceId":null,
      "timestampStart":"2019-05-16T21:30:55.174Z",
      "timestampEnd":"2019-05-16T21:30:55.174Z",
      "companyIDCode":"",
      "channelIDCode":"",
      "branchCode":"",
      "searchFields":{
         "key_3":"value",
         "key_2":"value",
         "key_1":"value"
      },
      "annotation":{},
      "caller":{
         "id":"",
         "version":"",
         "acronym":""
      },
      "called":{
         "id":"",
         "version":"",
         "acronym":""
      },
      "storage":{
         "id":"",
         "start":"",
         "end":""
      }
   },
   "headers":[],
   "payLoad":{
      "input":{
         "encoding":"1024",
         "ccsid":"1024",
         "data":"dati_in"
      },
      "output":{
         "encoding":"1024",
         "ccsid":"1024",
         "data":"dati_out"
      }
   }
}

The expected result is something like this:

  • calledid -> aaa
  • callerversion -> 1
  • callerid -> bbb

We tried something like this:

[calledid]
REGEX = (?<=called).*"id":"(?P<calledid>.*?)(?=")
FORMAT = calledid::"$1"
WRITE_META = true

but it doesn't work because the greedy `.*` runs up to the last "id" it finds, so the text consumed before the captured value is, for example:

":{"id":"","version":"","acronym":""},"storage":{"id":"

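The behaviour can be reproduced outside Splunk. The following is a minimal sketch, assuming Python's `re` module treats these patterns the same way as the engine Splunk uses (which may not hold for every construct). The values `aaa`, `bbb` and `1` come from the expected result above; the storage id `sss` and the helper name `scoped` are hypothetical and only there for illustration. The second pattern is one possible direction rather than a confirmed solution: it anchors on the parent key and uses `[^{}]*?` so the match cannot leave the parent object.

```python
import json
import re

# Hypothetical sample values; the structure mirrors the event shown above,
# serialized on a single line as in the overmatch example.
event = json.dumps(
    {
        "info": {
            "caller": {"id": "bbb", "version": "1", "acronym": ""},
            "called": {"id": "aaa", "version": "", "acronym": ""},
            "storage": {"id": "sss", "start": "", "end": ""},
        }
    },
    separators=(",", ":"),
)

# Original attempt: the greedy `.*` runs to the end of the event and
# backtracks only as far as the LAST "id", which belongs to "storage".
greedy = re.compile(r'(?<=called).*"id":"(?P<calledid>.*?)(?=")')


def scoped(parent, key):
    """Build a pattern for `key` directly under `parent` (hypothetical helper).

    `[^{}]*?` cannot cross a brace, so the match stays inside the parent
    object. This assumes no nested object precedes `key` within `parent`.
    """
    return re.compile(
        r'"%s":\{[^{}]*?"%s":"(?P<value>[^"]*)"'
        % (re.escape(parent), re.escape(key))
    )


greedy_calledid = greedy.search(event).group("calledid")
calledid = scoped("called", "id").search(event).group("value")
callerid = scoped("caller", "id").search(event).group("value")
callerversion = scoped("caller", "version").search(event).group("value")

print(greedy_calledid)  # sss (wrong: taken from "storage")
print(calledid)         # aaa
print(callerid)         # bbb
print(callerversion)    # 1
```

Whether the scoped pattern can be used as-is in a REGEX setting, and how FORMAT should reference its capture group, is something we have not verified.
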
Thanks in advance.

Pietro Fragnito
  • What programming language are you using? – MonkeyZeus Oct 30 '19 at 12:06
  • @MonkeyZeus do you mean the regex flavor? In that case we are trying the JavaScript one (and testing the regex at https://regex101.com/). – Pietro Fragnito Oct 30 '19 at 12:16
  • If you're working with JSON then a JSON parser is the proper solution. Regex is incredibly ill-suited for parsing JSON. If you're just doing this as a learning exercise on regex101 then keep trying and you will quickly see why a JSON parser is the right answer. – MonkeyZeus Oct 30 '19 at 12:19
  • @MonkeyZeus this is not the case: the regex will be used in the configuration file of an enterprise solution (Splunk), not in a program written in some language. – Pietro Fragnito Oct 30 '19 at 13:39
  • @Toto the reply above also applies to you. – Pietro Fragnito Oct 30 '19 at 13:40
  • If you want to extract the fields at ingest time, Splunk can be configured to do so. Use INDEXED_EXTRACTIONS = JSON in props.conf. Alternatively, if you wish to parse this data at search time, you can use the `spath` command. Suggest you re-ask this question and provide the full event, and I can assist in more detail. – Simon Duff Oct 30 '19 at 22:54
  • @SimonDuff I'm new to Splunk, so sorry in advance. To give more detail: we have to use the regex in the transforms.conf file in order to extract some fields at index time and use those fields in a search with the tstats command (because it offers faster searches). By the way, I will edit the question! Thanks in advance – Pietro Fragnito Oct 31 '19 at 08:07
  • @Toto could you please remove the duplicated answer? – Pietro Fragnito Oct 31 '19 at 08:19
  • @SimonDuff I wrote another question as you suggested – Pietro Fragnito Oct 31 '19 at 08:32
