2

Consider the following log data,

{"level":"info","msg":"Enqueued a recurring task","time":"2022-04-01T18:46:22.854684-07:00","uuid":"4e9fc098-f611-4128-ae0b-2e6b9cb232c8"}
{"level":"info","msg":"Reported a result","path":"/results","time":"2022-04-01T18:46:22.955999-07:00","uuid":"4e9fc098-f611-4128-ae0b-2e6b9cb232c8"}
{"level":"info","msg":"Reported a result","path":"/results","time":"2022-04-01T18:46:23.056295-07:00","uuid":"4e9fc098-f611-4128-ae0b-2e6b9cb232c8"}
{"level":"info","msg":"Enqueued a recurring task","time":"2022-04-01T18:46:23.056376-07:00","uuid":"28e9bea9-5d0c-4dd5-af4f-c22944fc4fcd"}

It represents enqueuing a recurring task with a certain uuid, the results of which can be reported multiple times (or not at all) with the same uuid. I'm interested in determining the interval elapsed between when the task was enqueued and when the result was reported for the first time. So far, I can display the results this outer join table,

msg="Enqueued a recurring task" 
| join type=outer left=L right=R where L.uuid = R.uuid 
    [ search msg="Reported a result" | dedup uuid sortby +_time] 
| fillnull value="" 
| table _time, L.msg, L.uuid, R.msg, L.time, R.time

enter image description here

What I'd like for convenience is to add an additional column with the difference between R.time and L.time. As far as I can tell from how to calculate duration between two events Splunk, one way to do this is to use strptime to convert those time fields into time values and then determine their difference. However, the time of the events was already parsed when importing the data (as seen from the built-in _time field) so this seems inefficient to me. Is there a more efficient way?

Update

Even the 'inefficient' method using strptime is proving tricky; I've added eval fields t to each of the searches,

msg="Enqueued a recurring task" 
| eval t=strptime(time, "%Y-%m-%dT%H:%M:%S.%6N%:z") 
| join type=outer left=L right=R where L.uuid = R.uuid 
    [ search msg="Reported a result" 
    | dedup uuid sortby +_time 
    | eval t=strptime(time, "%Y-%m-%dT%H:%M:%S.%6N%:z")]
| fillnull value="" 
| table _time, L.msg, L.uuid, R.msg, L.time, R.time, L.t, R.t, R.t-L.t

and it appears to parse the time correctly, yet the difference R.t-L.t shows up empty in the table: enter image description here Any idea why it's not showing the difference?

Update 2

I've tried both of RichG's answers, but they both lead to the following error in Splunk:

Error in 'eval' command: Type checking failed. '-' only takes numbers.

(See screenshots below). The way I'm running Splunk is with a Docker container,

docker run -p 8000:8000 -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_PASSWORD=supersecret" --name splunk splunk/splunk:latest

enter image description here

The second one leads to the error

Update 3

I finally got this to work using RichG's updated answer, combining both an evaluated strftime field with a diff field where R.t and L.t are quoted:

msg="Enqueued a recurring task" 
| eval t=strptime(time, "%Y-%m-%dT%H:%M:%S.%6N%:z") 
| join type=outer left=L right=R where L.uuid = R.uuid 
    [ search msg="Reported a result" 
    | dedup uuid sortby +_time 
    | eval t=strptime(time, "%Y-%m-%dT%H:%M:%S.%6N%:z")]
| fillnull value="" 
| eval diff='R.t'-'L.t'
| table _time, L.msg, L.uuid, R.msg, L.time, R.time, L.t, R.t, diff

enter image description here

Kurt Peek
  • 52,165
  • 91
  • 301
  • 526

2 Answers2

2

The time difference is empty in the table because the table command does not perform arithmetic. You must calculate the difference in a separate eval and display the result in the table.

index=foo msg="Enqueued a recurring task" 
| eval t=strptime(time, "%Y-%m-%dT%H:%M:%S.%6N%:z") 
| join type=outer left=L right=R where L.uuid = R.uuid 
    [ search index=foo msg="Reported a result" 
    | dedup uuid sortby +_time 
    | eval t=strptime(time, "%Y-%m-%dT%H:%M:%S.%6N%:z")]
| fillnull value="" 
| eval diff='R.t' - 'L.t'
| table _time, L.msg, L.uuid, R.msg, L.time, R.time, L.t, R.t, diff

You should be able to use each event's _time field to avoid having to parse the time field.

index=foo msg="Enqueued a recurring task" 
| join type=outer left=L right=R where L.uuid = R.uuid 
    [ search index=foo msg="Reported a result" 
    | dedup uuid sortby +_time ]
| fillnull value="" 
| eval diff='R._time' - 'L._time'
| table _time, L.msg, L.uuid, R.msg, L.time, R.time, L.t, R.t, diff
RichG
  • 9,063
  • 2
  • 18
  • 29
  • I've actually tried this but I get a Splunk type check error, `'-' only takes numbers.`. This does surprise me because your answer is similar to https://community.splunk.com/t5/Splunk-Search/eval-Time-difference/m-p/109208, for example. Any idea what the (remaining) issue might be? – Kurt Peek Apr 02 '22 at 21:28
  • 1
    Try quoting the field names. `| eval diff='R._time' - 'L._time'` – RichG Apr 02 '22 at 23:56
  • That seems to get ride of the syntax error, but the `diff` column in the resulting table is empty. – Kurt Peek Apr 04 '22 at 03:25
  • 1
    I wonder if `eval` doesn't work right with `join`'s left and right options. Try renaming them. `| rename 'R._time' as R_time, 'L._time' as L_time | eval diff = R_time - L_time`. – RichG Apr 04 '22 at 16:00
  • As I've updated in the question, your first answer with `strptime` and quoted fields in the diff works! (I tried using `rename` without `strptime` as you suggested above, but that still gives rise to an empty `diff` column, so I still haven't managed to use the fact that Splunk already parsed the timestamps when it loaded the data, but at least it works). – Kurt Peek Apr 04 '22 at 16:13
1

Fairly confident something akin to this should work (without using join:

index=ndx sourtype=srctp uuid=* msg=*
| stats min(_time) as first_time max(_time) as last_time earliest(msg) as first_msg latest(msg) as last_msg by uuid
| eval diff_seconds=last_time-first_time
| eval first_time=strftime(first_time,"%c"), last_time=strftime(last_time,"%c")

This approach will presume that _time has been set properly in the sourcetype's props.conf, but if it has, this gets you what you're looking for in one pass.

warren
  • 32,620
  • 21
  • 85
  • 124
  • That works! The only thing is that it appears to assume that the only logs with a `msg` field are the ones I'm interested in, `Enqueued a recurring task` and `Reported a result`, respectively. In my real example, however, there are many other logs with a `msg` field and I would like to filter the results down to those values specifically. – Kurt Peek Apr 04 '22 at 15:48
  • @KurtPeek - then do something like `(msg="enqueued a recurring task" OR msg="reported a result")` instead :) – warren Apr 05 '22 at 14:57